October 6, 2018

A Monty Python space Odyssey

After 2001: A Space Odyssey, the second best movie ever could be Monty Python and the Holy Grail:
Mønti Pythøn ik den Hølie Gräilen Røtern nik Akten Di Wik Alsø wik Alsø alsø wik Wi nøt trei a høliday in Sweden this yër? See the løveli lakes The wøndërful telephøne system And mäni interesting furry animals

A Pink Floyd odyssey: Jupiter and beyond the infinite


September 27, 2018

BRANE power: Gene network inference with graph optimization

And the IFPEN 2018 Yves Chauvin PhD prize is awarded to Aurélie Pirayre, for IFPEN's first thesis on bioinformatics, with graph optimization for gene networks. In French, you can now read "BRANE Power": gènes et algorithmes, une alliance pour la chimie verte.

Aurélie Pirayre, Grégoire Allaire, Didier Houssin, Pierre-Henri Bigeard, Eric Heintzé

PhD manuscript and slides for her thesis:
  • Reconstruction and clustering with graph optimization and priors on gene networks and images (manuscript)
    • Abstract: The discovery of novel gene regulatory processes improves the understanding of cell phenotypic responses to external stimuli for many biological applications, such as medicine, environment or biotechnologies. To this purpose, transcriptomic data are generated and analyzed from DNA microarrays or more recently RNAseq experiments. They consist of gene expression level sequences obtained for all genes of a studied organism placed in different living conditions. From these data, gene regulation mechanisms can be recovered by revealing topological links encoded in graphs. In regulatory graphs, nodes correspond to genes. A link between two nodes is identified if a regulation relationship exists between the two corresponding genes. Such networks are called Gene Regulatory Networks (GRNs). Their construction as well as their analysis remain challenging despite the large number of available inference methods. In this thesis, we propose to address this network inference problem with recently developed techniques pertaining to graph optimization. Given all the pairwise gene regulation information available, we propose to determine the presence of edges in the final GRN by adopting an energy optimization formulation integrating additional constraints. Either biological (information about gene interactions) or structural (information about node connectivity) a priori have been considered to restrict the space of possible solutions. Different priors lead to different properties of the global cost function, for which various optimization strategies, either discrete or continuous, can be applied. The post-processing network refinements we designed led to computational approaches named BRANE for "Biologically-Related A priori for Network Enhancement". For each of the proposed methods (BRANE Cut, BRANE Relax and BRANE Clust), our contributions are threefold: a priori-based formulation, design of the optimization strategy and validation (numerical and/or biological) on benchmark datasets from DREAM4 and DREAM5 challenges showing numerical improvement reaching 20%. In a ramification of this thesis, we slide from graph inference to more generic data processing such as inverse problems. We notably invest in HOGMep, a Bayesian-based approach using a Variational Bayesian Approximation framework for its resolution. This approach makes it possible to jointly perform reconstruction and clustering/segmentation tasks on multi-component data (for instance signals or images). Its performance in a color image deconvolution context demonstrates both quality of reconstruction and segmentation. A preliminary study in a medical data classification context linking genotype and phenotype yields promising results for forthcoming bioinformatics adaptations.
  • Slides (PhD defense on July 3rd, 2017)
They are finally online (check the EURASIP Library of Ph.D. Theses). The work was guided by the concept of BRANE power: a methodology for gene regulatory network inference and clustering based on graph optimization and biological priors. BRANE stands for Biologically-Related A priori for Network Enhancement. It rhymes with cell membrane (and brain, for what it's worth).

Gene regulatory network inference with BRANE Cut

State-of-the-art results are obtained on synthetic and real transcriptomic data (DREAM4 and DREAM5 from the DREAM consortium challenges, and an Escherichia coli dataset). Derived methods are BRANE Cut (with graph cuts), BRANE Relax (with proximal optimization) and BRANE Clust (with graph Laplacian).
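To give a flavor of the graph-cut machinery mentioned above, here is a minimal sketch of the generic maximal flow / minimum cut duality, using the textbook Edmonds-Karp algorithm. It is not the BRANE Cut energy or its implementation: the function name `min_cut`, the toy node names and the capacities are all illustrative assumptions.

```python
from collections import deque


def min_cut(capacity, source, sink):
    """Generic Edmonds-Karp max-flow (illustrative, not BRANE Cut itself).

    capacity: dict mapping directed arcs (u, v) to non-negative capacities.
    Returns (max_flow_value, set_of_nodes_on_the_source_side_of_a_min_cut).
    """
    # Residual graph: every arc gets a reverse arc with capacity 0.
    residual = {}
    for (u, v), c in capacity.items():
        residual.setdefault(u, {})
        residual.setdefault(v, {})
        residual[u][v] = residual[u].get(v, 0) + c
        residual[v].setdefault(u, 0)

    def reachable_from_source():
        # Breadth-first search along arcs with remaining capacity.
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    flow = 0
    while True:
        parent = reachable_from_source()
        if sink not in parent:
            # No augmenting path left: reachable nodes form the source side.
            return flow, set(parent)
        # Find the bottleneck capacity along the path found by the search.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Push that amount of flow along the path.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck


# Toy graph (hypothetical weights): "s" and "t" are the two terminals.
toy = {("s", "a"): 3, ("a", "t"): 1, ("s", "b"): 1, ("b", "t"): 3}
value, side = min_cut(toy, "s", "t")
cut_arcs = sorted((u, v) for (u, v) in toy if u in side and v not in side)
print(value, sorted(side), cut_arcs)
```

On this toy graph the cut separates {s, a} from {b, t}, and the total capacity of the cut arcs equals the maximal flow. In a graph-cut formulation, arc weights would encode the energy terms, which is where the priors come in; that modelling step is not reproduced here.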

Gene network joint inference and clustering with BRANE Clust



Used concepts include:
  • data science, optimization on graphs: maximal flow, minimum cut, random walker algorithm, variational and variational Bayes formalisms, convex relaxation, alternating optimization, combinatorial Dirichlet problem, hard-clustering and soft-clustering
  • biology, biotechnology, bioinformatics: transcription factors (TFs) as regulators and non-transcription factors (non-TFs) as targets, modular networks, biological priors, in-silico data, second generation bio-fuel production, DREAM4 challenge, DREAM5 challenge
  • application to biofuel and green chemistry production (with the fungus Trichoderma reesei)
Supervising team:
PhD Thesis reporters

PhD Thesis Examiners
More links:

September 9, 2018

Kultur Pop 44: Brain and evolution

[Update, 29/09/2018, for Pascale Casanova] She hosted Les mardis littéraires, Les jeudis littéraires and L'atelier littéraire. One life does not make a habit: Kultur Pop adds a ninth track, their theme tune, to Kultur Pop 44, Brain. It was already in Kultur Pop 01, 11 years ago, in 2007. For 3^2-2^3 = 1 is a rare equality.
"We are using your brain's electrical system as a receiver, We are unable to transmit to your conscious neural interference"
Meanwhile:
the 44th volume, Brain, of the Kultur Pop theme tunes (France Culture/France Inter, and sometimes a few intruders), has (finally) just been released.



On the programme: Kultur Pop 2018.44: Brain
  • France Culture, Interlude nuits : Alain Romans, Quel temps fait-il à Paris ? (Les vacances de monsieur Hulot)
  • France Culture, Culture protestante : Ensemble Lucidarium, O prebstres, prebstres
  • France Culture, Science publique : Brian Eno & David Byrne, The Jezebel Spirit
  • France Culture, Concordance des temps : Louis Sclavis Sextet, Charmes
  • France Culture, Interlude nuits : Alexandre Desplat, Camera Obscura (Girl With A Pearl Earring)
  • France Culture, Agora : L'Orchestre de Contrebasses, Sablier
  • France Culture, Grands reportages : Bonobo, Kerala
  • France Culture, Culture de soi, cultures des autres : Music Ensemble of Benares, Kathak Nritya, part 1 & 2
  • Ghost track. France Culture, Atelier littéraire, mardis littéraires, jeudis littéraires (Pascale Casanova, 29 September 2018) : DJ Shadow, Stem/Long Stem

There is no hope, there's only chaos and evolution (Evereve, Fade to grey, Visage cover)


And at the same time (it's the fashion), we celebrate the BRANE Power, and evolution: Gene network inference with graph optimization

July 29, 2018

Multiscale representation of hexahedral meshes & compression

Companion pages:
A full-scale geological grid structure is decomposed into embedded wavelet-like scales while preserving the discontinuities, here geological faults (red), using a morphological 2D wavelet:
Geological grid structures and discontinuities preservation (red painted faults)
Categorical properties like rock types (sandstone, limestone, shale) can be upscaled according to a dedicated non-linear decomposition called modelet (patent #20170344676: Method of exploitation of hydrocarbons of an underground formation by means of optimized scaling):


Hexahedral mesh categorical property: rock type

Continuous properties (saturation, porosity, permeability, temperature) can be homogenized with a 3D Haar wavelet:

Hexahedral mesh continuous property: porosity
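As a rough illustration of what a Haar-based homogenization does, here is a minimal sketch of one level of an unnormalized 1D Haar transform (pairwise averages and half-differences) with its inverse. It is not the HexaShrink code: the function names and the sample porosity values are assumptions, and a 3D version would apply the same step separably along each of the three grid axes.

```python
def haar_step(values):
    """One level of an unnormalized Haar transform on an even-length list.

    Returns (approx, detail): pairwise means (the coarser, homogenized
    property) and pairwise half-differences (what is needed to go back).
    """
    if len(values) % 2 != 0:
        raise ValueError("haar_step expects an even number of samples")
    approx = [(values[i] + values[i + 1]) / 2 for i in range(0, len(values), 2)]
    detail = [(values[i] - values[i + 1]) / 2 for i in range(0, len(values), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Rebuild the finer resolution from the means and half-differences."""
    values = []
    for a, d in zip(approx, detail):
        values.extend([a + d, a - d])
    return values


# Hypothetical porosity values along one row of cells.
porosity = [0.25, 0.75, 0.5, 0.5]
coarse, detail = haar_step(porosity)
print(coarse, detail)                 # coarse scale and details
print(haar_inverse(coarse, detail))   # back to the original row
```

The coarse values are local means, which is why this kind of scheme suits continuous properties, while the stored details make the decomposition invertible. With general floating-point inputs the round trip holds only up to rounding; this sketch makes no claim about how exact reversibility is achieved in the actual method.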

The HexaShrink methodology described above is detailed in the recently submitted paper: 
With the huge progress in data acquisition realized over the past decades, and acquisition systems now able to produce high resolution point clouds, the digitization of physical terrains becomes increasingly precise. Such extreme quantities of generated and modeled data greatly impact computational performance on many levels: storage media, memory requirements, transfer capability, and finally simulation interactivity, necessary to exploit this instance of big data. Efficient representations and storage are thus becoming "enabling technologies" in simulation science. We propose HexaShrink, an original decomposition scheme for structured hexahedral volume meshes. The latter are used for instance in biomedical engineering, materials science, or geosciences. HexaShrink provides a comprehensive framework allowing efficient mesh visualization and storage. Its exactly reversible multiresolution decomposition yields a hierarchy of meshes of increasing levels of detail, in terms of either geometry, continuous or categorical properties of cells. Starting with an overview of volume mesh compression techniques, our contribution coherently blends different multiresolution wavelet schemes. It results in a global framework preserving discontinuities (faults) across scales, implemented as a fully reversible upscaling. Experimental results are provided on meshes of varying complexity. They emphasize the consistency of the proposed representation, in terms of visualization, attribute downsampling and distribution at different resolutions. Finally, HexaShrink yields gains in storage space when combined with lossless compression techniques.
And there is a patent associated with HexaShrink, Method of exploitation of hydrocarbons of an underground formation by means of optimized scaling:

Method of exploitation of hydrocarbons of an underground formation by means of optimized scaling


July 10, 2018

Bioinformatics & datascience: Internship & PhD on multi-omics data

A PhD position is still available on Graph-based learning from integrated multi-omics and multi-species data (genomic, transcriptomic, epigenetic) between IFP Energies nouvelles and CentraleSupélec/INRIA Saclay. All the information is gathered at this address.

Some information is duplicated below:
Micro-organisms are studied here for their application to bio-based chemistry from renewable sources. Such organisms are driven by their genome expression, with very diverse mechanisms acting at various biological scales, sensitive to external conditions (nutrients, environment). The advent of novel high-throughput experimental technologies provides complementary omics data and, therefore, a better capability for understanding the studied biological systems. Innovative analysis methods are required for such highly integrated data. Their handling increasingly requires advanced bioinformatics, data science and optimization tools to provide insights into the multi-level regulation mechanisms (Editorial: Multi-omic data integration). The main objective of this subject is to offer an improved understanding of the different regulation levels in the cell (from model organisms to Trichoderma reesei strains). The underlying prediction task requires the normalization and the integration of heterogeneous biological data (genomic, transcriptomic and epigenetic) from different microorganisms. The path chosen is that of graph modelling and network optimization techniques, allowing the combination of data of different natures, with the incorporation of biological a priori (in the line of the BRANE Cut and BRANE Clust algorithms). Learning models relating genomic and transcriptomic data to epigenomic traits could be associated with network inference, source separation and clustering techniques to achieve this aim. The methodology would inherit from a wealth of techniques developed over graphs for scattered data and social networks. Attention will also be paid to novel evaluation metrics, as their standardization remains a crucial stake in bioinformatics. A preliminary internship position (summer/fall 2018) is suggested before engaging in the PhD program. Information at: http://www.laurent-duval.eu/lcd-2018-intern-phd-epigenetics-omics-graph-processing.html

May 6, 2018

Hungarian Syzygies - Trauma Memorial - Werckmeister harmonies - 2001

I am sorry David, I am afraid I should do that :) This gathering of thoughts stems from a talk in Budapest, Hungary, with David, from Shakespeare's Helmet Collective. He stands in the eye of a political storm (the recent Hungarian elections and Viktor Orbán's declarations), and proposed (with a collective) a Trauma Memorial in the center of Heroes' Square (Hősök tere) in Budapest. I was lucky enough to witness it directly. It appeared as a black monolith with a video stripe:

Budapest, Heroes Square, Trauma Memorial installation around the right of freedom
Here is a story. But you can skip directly to the video: A millenniumi emlékmű kiegészítése 100 év hordalékával. As a side note, I was happy to be in Hungary for the first time, home of many Hungarian scientists, mostly mathematicians (some known as the Martians), some being prominent in the history of wavelets, like Alfréd Haar or Frigyes Riesz, who were put in a multiscale perspective in a 2D wavelet panorama review paper, details below:

Haar and Riesz with multiscale wavelets

A few weeks ago, I was given the opportunity to watch Werckmeister harmóniák by Béla Tarr (Les Harmonies Werckmeister or Werckmeister harmonies) with a friend. He insisted that we should watch the movie, given the following pitch:
A guy in a small bar for drunkards builds a choreography with the local boozers, making them reproduce the motions of the planets and satellites of the solar system. In black and white.
Werckmeister Harmonies: satellites and planets in motion
The movie was stunning, with shades, a whale and rising violence. I could not help relating it to Stanley Kubrick's 2001: A Space Odyssey (as we are celebrating its 50th anniversary) and to Ian Watson's The Embedding (foreign languages and the whale). We ended this cinema show with the 1962 dystopian black-and-white short movie La jetée (The Jetty), by Chris Marker, aka Christian Bouche-Villeneuve, which was the inspiration for Terry Gilliam's Twelve Monkeys. A story about global war, time-travel, memories, love and death. It can be viewed on Vimeo: La jetée.

Chris Marker (or his Sans Soleil Hungarian avatars, Sandor and Michel Krasna, born in Kolozsvar, 1932 and Budapest, 1946, respectively) is currently the subject of a retrospective at La Cinémathèque in Paris: Chris Marker, les 7 vies d'un cinéaste (May 3 to July 29, 2018).

Then, I was in Budapest in April 2018, for an all-too-short weekend. As I am an obsessed 2001 fan, reminiscences of 2001 were evident everywhere, whether in Budapest's magnificent parliament, the Vasarely museum or simply the streets of Budapest.

2001 space odyssey reminiscence from Budapest
So I went to Hősök tere, a beautiful square with monuments celebrating the Magyar historical background. And right in the center of the square, an installation displayed an intriguing video and sound on one side of a black cube:

Trauma memorial, pixels in a silhouette
This marked a Hungarian syzygy, a connection of seemingly unrelated events. The video displayed the upper half of a dark suit, uttering speeches I could not understand. Of course, Hungarian is known as a special language, in the Finno-Ugric branch of the Uralic family. Funnily, the subtitles were in esoteric ASCII characters. But after a few seconds, it became clear that the sounds were reversed, spoken backwards. Apart from causality issues, I should confess that backwards or forwards, Hungarian remains foreign to me. So hopping inside the cube, one could be welcomed by its "kind wards".
Our insatiable stomach
This picture summarizes a state of affairs: a rising tide of nationalism and autocratic power, growing on sedimented ancient trauma and more recent angers and fears (as far as I can understand). The above "Our insatiable stomach" is a timely snapshot, with the close-sounding words Hungary and Hungry. So here it is, a mere addendum to this black blocky sedimentation of history, cast in reverse: A millenniumi emlékmű kiegészítése 100 év hordalékával. For fun: this Shakespeare's Helmet Collective work is curated by... Byron (for those who have an eye for the finest details).


Links:

January 15, 2018

Theories of Deep Learning, videos and slides

With a touch of provocation carried by the poster, Stanford University's STATS 385 (Fall 2017) offers a series of talks on the Theories of Deep Learning, with deep learning videos, lecture slides, and a cheat sheet (stuff that everyone needs to know). Outside the yellow submarine, Nemo-like sea creatures depict Fei-Fei Li, Yoshua Bengio, Geoffrey Hinton, Yann LeCun on a Deep dream background. So, wrapping up stuff about CNNs (convolutional neural networks):

The spectacular recent successes of deep learning are purely empirical. Nevertheless intellectuals always try to explain important developments theoretically. In this literature course we will review recent work of Bruna and Mallat, Mhaskar and Poggio, Papyan and Elad, Bolcskei and co-authors, Baraniuk and co-authors, and others, seeking to build theoretical frameworks deriving deep networks as consequences. After initial background lectures, we will have some of the authors presenting lectures on specific papers. This course meets once weekly.
Videos and slides are gathered as follows.
  1. Theories of Deep Learning, Lecture 01: Deep Learning Challenge. Is There Theory? (Donoho/Monajemi/Papyan) : video, slides
  2. Theories of Deep Learning, Lecture 02: Overview of Deep Learning From a Practical Point of View (Donoho/Monajemi/Papyan) : video, slides
  3. Theories of Deep Learning, Lecture 03: Harmonic Analysis of Deep Convolutional Neural Networks (Helmut Bolcskei) : video, slides
  4. Theories of Deep Learning, Lecture 04: Convnets from First Principles: Generative Models, Dynamic Programming & EM (Ankit Patel) : video, slides
  5. Theories of Deep Learning, Lecture 05: When Can Deep Networks Avoid the Curse of Dimensionality and Other Theoretical Puzzles (Tomaso Poggio) : video, slides
  6. Theories of Deep Learning, Lecture 06: Views of Deep Networks from Reproducing Kernel Hilbert Spaces (Zaid Harchaoui) : video, slides
  7. Theories of Deep Learning, Lecture 07: Understanding and Improving Deep Learning With Random Matrix Theory (Jeffrey Pennington) : video, slides
  8. Theories of Deep Learning, Lecture 08: Topology and Geometry of Half-Rectified Network Optimization (Joan Bruna) : video, slides
  9. Theories of Deep Learning, Lecture 09: What’s Missing from Deep Learning? (Bruno Olshausen) : video, slides
  10. Theories of Deep Learning, Lecture 10: Convolutional Neural Networks in View of Sparse Coding (Vardan Papyan and David Donoho) : video, slides