
Title: BIology, Genetics and Statistics

43 pages

  • dissertation


Project Proposal
Title: BIology, Genetics and Statistics
Acronym: BIGS
Scientific leader: Samy Tindel
Permanent responsible: Pierre Vallois
Proposed INRIA theme: Biological Systems
Keywords: Data Analysis, Support Vector Machines, Nonparametric Statistics, Regression models, Stochastic differential equations, Likelihood estimation.



Contents

1. Project-team composition
2. Overall objectives
3. Application domains
3.1. Statistical analysis of high dimensional data
3.2. Estimation for Complex and Biological Systems
3.3. Nature of the data
4. State of the art
4.1. Statistical analysis of high dimensional data
4.2. Estimation for Complex and Biological Systems
5. Scientific foundations
5.1. Statistical analysis of high dimensional data
5.2. Estimation for Complex and Biological Systems
6. Research directions
6.1. Statistical analysis of high dimensional data
6.2. Estimation for Complex and Biological Systems
7. Software
8. Expected results and criteria of success
9. Project-team positioning
9.1. Positioning with respect to other INRIA groups
9.2. Research groups in related areas
10. Collaborations
10.1. Collaborations with other INRIA project teams
10.2. Collaborations with other French research groups
10.3. Collaborations with foreign research groups
10.4. Industrial Collaborations
Project-team bibliography
Books, Monographs, and book chapters
Doctoral dissertations and "Habilitation" theses
Articles in international journals
Publications in international conferences and workshops
Articles in national journals
Publications in national conferences and workshops
Internal Reports
Annex A. Permanent member résumé
References
1. Project-team composition

List the members of the proposed team, including the current PhD students.

Staff Members
Thierry Bastogne (Assistant Professor, University of Nancy 1 and CRAN, 50%)
Sandie Ferrigno (Assistant Professor, INPL and IECN)
Céline Lacaux (Assistant Professor, INPL and IECN, 50%)
Jean-Marie Monnez (Professor, University of Nancy 2 and IECN)
Aurélie Muller (Assistant Professor, INPL and IECN)
Samy Tindel (Professor, University of Nancy 1 and IECN)
Pierre Vallois (Professor, University of Nancy 1 and IECN)
Sophie Wantz (Assistant Professor, University of Nancy 2 and IECN, 50%)

PhD students
Aurélien Deya (IECN)
Roukaya Keinj (CRAN and IECN)
Rémi Bonidal (IECN and LORIA)

2. Overall objectives

Describe the overall objectives and the main research directions of the project. This section should allow the reader to have a fair idea of the project at one glance.

The Biostatistics activities at the Institut Elie Cartan (IECN) began four years ago, when contacts were established between our Probability team and specialists in other areas, such as Medicine, Biophysics, Automatic Control and Agronomy. These contacts, supported by concrete applied projects, quickly gave rise to a main driving direction in our investigations: analyzing high dimensional data, with a special view towards modeling and simulating biological phenomena. Interestingly enough, this kind of analysis requires new mathematical techniques from a wide range of domains, such as Monte Carlo methods, classical and Bayesian statistics, Markov models, and convergence and inference for stochastic processes. It is thus important to emphasize that a large spectrum of tools will be required in order to analyze high dimensional data in our context. This is mainly because the current project has to be seen as an initial impulse, which should launch a new dynamic in biostatistics at Nancy, and hopefully in the East of France as well. We wish to achieve this impulse by:

(1) Federating the energies of most of the researchers in pure and applied statistics in Nancy.
(2) Starting a good number of interesting projects, going from a substantial theoretical background to real-world applications in Biology, Genetics or Medicine.

This application-oriented point of view, always focused on biological systems, forces us to have a grasp of many different techniques. We claim however that the composition of our team, made up of competent probabilists and statisticians coming from different areas, allows us to face this challenging situation. We shall now try to summarize the main directions that will lead our research in Biostatistics over the coming years.
(1) Statistical analysis of high dimensional data: Cancer detection programs and other applications in genetics have created a great demand for the analysis of small samples of high dimensional data (typically, a data point can be a vector in $\mathbb{R}^d$, with $d$ of order $10^4$, while the sample size $n$ is of order $10^2$). Our team wishes to make some steps in this direction, and we shall focus mainly on the following topics:

Online Factorial Analysis: High dimensional data are often obtained online, and cannot be stored in their entirety in computer memory. One of the recent challenges in data analysis is then to perform an accurate classification by taking advantage of the possibility of updating our information. This should be done, of course, in a rather simple and efficient way, allowing real-time analysis. We plan to use techniques based on sophisticated tools from stochastic approximation in order to complete this kind of program.

Probabilistic Analysis of Support Vector Machines: Still in the context of classification, we also wish to analyze, at a probabilistic level, the convergence rate of some estimated boundary for multiclass Support Vector Machine algorithms. This kind of analysis has already been carried out for the case of two classes to be separated, and relies on concentration inequalities, which are basically distribution free. The generalization to the multiclass case is non-trivial, and has already been studied by Yann Guermeur (LORIA). We shall introduce some probabilistic tools in order to improve the rates of convergence obtained for such algorithms. We also plan to perform an extensive comparison of SVM techniques with other classification methods, at least in some particular cases of underlying distributions, like Gaussian mixtures. This very natural question, which arises in many practical studies, has surprisingly hardly been addressed from a theoretical point of view.

Local regression techniques: Our previous item on the comparison between classification methods is closely linked, as can easily be understood from a reading of [31], to local regression techniques. Our team also plans to contribute actively to this field, which is obviously interesting in its own right. Indeed, we shall study a global goodness-of-fit test for statistical models: our test will be able to assess, in quite a general framework, whether a given model fits a data set with regard to most of the assumptions made in elaborating the model. It will be based on a generalization of the Cramér-von Mises statistic and involves a nonparametric estimate of the conditional distribution of the response variable. This kind of strategy, which fits into the global picture of statistical analysis of high dimensional data, will be implemented for a study concerning fetal biometry.

(2) Estimation for Complex and Biological Systems: Thanks to our contacts with practitioners working on biological systems, our team has begun to become familiar with some estimation problems in different complex situations. As we shall see, these problems will also be an invitation to analyze in depth some general classes of noisy differential systems driven by Brownian motion or by more general families of stochastic processes, such as fractional Brownian motions. This leads us to an important scientific aim of our project, which consists in a thorough study of the asymptotic, numerical and statistical properties of Brownian or fractional differential systems. To be more specific, here is a list of some projects to which we will pay special attention:

Photodynamic therapy: Since 1988, some control system scientists and biologists at the Centre de Recherche en Automatique de Nancy (CRAN in short) have worked together to