## Posts

Showing posts from 2015

### BRANE Cut: Biologically-Related Apriori Network Enhancement with Graph cuts

[ BRANE Cut featured on RNA-Seq blog ][ Omic tools ][ bioRxiv preprint ][ PubMed/Biomed Central ][ BRANE Cut code ][ BRANE Omics ] Gene regulatory networks are somewhat difficult to infer. This first paper from ongoing work on BRANE Omics (termed BRANE *, for Biologically Related A priori Network Enhancement) introduces an optimization method (based on Graph cuts, borrowed from computer vision/image processing) to infer graphs from biologically related priors (including sparsity). It is successfully tested on DREAM challenge data and an Escherichia coli network, with specific work to derive optimization parameters from gene network cardinality properties. And it is quite fast. BRANE Cut: Biologically-Related Apriori Network Enhancement with Graph cuts for Gene Regulatory Network inference ( doi , BRANE Cut webpage , preprint ) Background : Inferring gene networks from high-throughput data constitutes an important step in the discovery of relevant regulat

### Big data, fishes and cooking: fourteen shades of "V"

[In this short post, you can find the 14 "V"s often attached to Big Data, including vacuity] A saying is often attributed to Lao Tzu (I cannot access the original meaning): Govern a great nation as you would cook a small fish. Do not overdo it. Today's wisdom could be: Deal with Big data as you would process a small signal. Do not over-expect from it, do not over-fit it, do not over-interpret it. Luckily, Big data does not exist , a piece where Making The Most Of Small Data is advocated. This is a bit like teenage sex: “Big Data is like teenage sex: everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it.” In the What exactly is Big Data StackExchange question, I have listed all 14 "V"s that could describe big data, including... vacuity . They are: Validity, Value, Variability/Variance, Variety, Velocity, Veracity/Veraciousness, Viability, Virtuality, Visualization, Volatility,

### Hugo Steinhaus, or K-means clustering in French

Kernel clustering [Modern transcription of the 1956 Hugo Steinhaus paper (in French) , at the source of k-means clustering algorithms, first published in a French-written post ] Data clustering, or clustering analysis, belongs to statistical data analysis methods. It aims at forming groups of objects that are similar in some way. Those groups are named clusters. The word cluster is related to clot , a thick mass of coagulated liquid or of material stuck together. The whole set of objects contains heterogeneous data, which ought to be grouped into subsets possessing a greater inner homogeneity. Such methods rely on similarity criteria or proximity measures. They are related to classification, machine learning, segmentation and pattern recognition, and have applications ranging from image processing to bioinformatics. One of the most popular clustering methods is known as K-means ( k-moyennes in French), with a variation called dynamic clustering (beautifully called nuée
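To make the grouping idea concrete, here is a minimal sketch of the usual alternating k-means iteration (often called Lloyd's algorithm), which is not necessarily the formulation in the Steinhaus paper. The function names, the random initialization, and the stopping rule are illustrative assumptions, not taken from the post.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, k, iterations=100, seed=0):
    """Alternate an assignment step and a centroid update step.

    Hypothetical helper: points are tuples of floats, k is the number of clusters.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumed initialization: k distinct data points
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster.
        new_centroids = []
        for j, members in enumerate(clusters):
            if members:
                dim = len(members[0])
                new_centroids.append(
                    tuple(sum(m[d] for m in members) / len(members) for d in range(dim))
                )
            else:
                new_centroids.append(centroids[j])  # an empty cluster keeps its centroid
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    return centroids, clusters


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, clusters = k_means(points, k=2)
```

On this toy data the two well-separated groups of three points are recovered. In general the result depends on the initialization, so several restarts are commonly used.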

### Hugo Steinhaus: k-means classification (k-moyennes), dynamic clustering (nuées dynamiques)

Kernel partitioning [Making available the 1956 paper by Hugo Steinhaus , at the origin of the k-means partitioning algorithm ( available in English )] Data partitioning ( data clustering or clustering analysis ) is a "statistical" data analysis method that aims at grouping, within a set of heterogeneous data, subsets of these data into more homogeneous clusters or bundles. Each subset should thus exhibit similar characteristics, quantified by similarity criteria or various proximity measures. These techniques belong to the families of classification, machine learning or segmentation, and are used in a phenomenal number of applications, from image processing to bioinformatics. One of the most popular partitioning or aggregation methods is k-means ( k-moyennes in French), an optimization problem com

### Sparse seismic data restoration: a PhD defense

Smoothed $\ell_1/\ell_2$ function for a sparse $\ell_0$ surrogate Mai Quyen PHAM defended her PhD thesis on July 15th, 2015 at 10.00 am, on the topic of "Seismic wave field restoration using sparse representations and quantitative analysis" (manuscript in pdf), at Université Paris-Est, bâtiment Copernic, amphithéâtre Maurice Gross, 5 boulevard Descartes (RER A, Noisy-Champs), 77420 Champs-sur-Marne. Its focus is twofold: 1) sparse adaptive filtering with approximate templates in redundant and geometric wavelet frames (akin to echo cancellation in speech), 2) sparse blind deconvolution for parsimonious reflectivity signals with an $\ell_1/\ell_2$ norm ratio penalty. This work has notably been published in two journal papers: * Euclid in a Taxicab: Sparse Blind Deconvolution with Smoothed l_1/l_2 Audrey Repetti, Mai Quyen Pham, Laurent Duval, Émilie Chouzenoux, Jean-Christophe Pesquet IEEE Signal Processing Letters, May 2015, Volume 22, Number 5, pages 539-543. http://
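
As a rough illustration of why an $\ell_1/\ell_2$ ratio can act as a sparsity surrogate, the sketch below computes the plain ratio and one possible smoothed variant. The smoothing form and the parameters `alpha` and `eta` are assumptions chosen for illustration; they are not claimed to match the exact penalty used in the thesis or the paper.

```python
import math


def l1_over_l2(x):
    """Plain l1/l2 norm ratio; undefined (division by zero) for the all-zero vector."""
    l1 = sum(abs(v) for v in x)
    l2 = math.sqrt(sum(v * v for v in x))
    return l1 / l2


def smoothed_l1_over_l2(x, alpha=1e-3, eta=1e-3):
    """One possible smooth surrogate of the ratio (assumed form).

    alpha and eta are hypothetical smoothing parameters: alpha removes the kink
    of |v| at zero, eta keeps the denominator away from zero.
    """
    l1_smooth = sum(math.sqrt(v * v + alpha * alpha) - alpha for v in x)
    l2_smooth = math.sqrt(sum(v * v for v in x) + eta * eta)
    return l1_smooth / l2_smooth


sparse = [0.0, 0.0, 3.0, 0.0]  # a single nonzero sample
dense = [1.0, 1.0, 1.0, 1.0]   # energy spread over all samples
ratio_sparse = l1_over_l2(sparse)
ratio_dense = l1_over_l2(dense)
```

The plain ratio is unchanged when the signal is rescaled and is smaller for the sparser signal, which is the behavior expected from a stand-in for the $\ell_0$ count. The smoothed variant stays defined and differentiable at zero.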