Colliding galaxies – Alzheimer’s and Big Data


Big Data: Could It Ever Cure Alzheimer’s Disease?

http://www.medscape.com/viewarticle/833784 I was struck by this article by Masud Husain (closed access in Brain, so you cannot read the original), which is reprinted on Medscape.

The article nudged me to write a post because it reflects the challenges and opportunities created by two behemoths which, like galaxies, are slowly colliding. It's addressing an area of growing interest because organisations are waking up to the value of taking information from existing datasets, generating targeted data, and looking within it to drive insight, rather than establishing a hypothesis and finding data to support or refute it.

“From business to government, many have been seduced by the possibilities that Big Data seems to offer for a whole range of problems.”

Alzheimer’s disease (AD) research has been accreting datasets and large-scale studies. Projects such as the Alzheimer’s Disease Neuroimaging Initiative (ADNI) ($200M invested so far) are sequencing several hundred full human genomes of patients with AD. At the same time, supported by the Cure Alzheimer’s foundation, our group at Harvard, at Massachusetts General Hospital and here in the UK at the Sheffield Institute for Translational Neuroscience (SITraN) is analysing a set of 1500 whole genome sequences of AD sufferers. Our group is amongst the first laboratories worldwide to have undertaken a study of this magnitude, and the fact that we have done so outside the domains of the current academic sequencing centres is difficult for funding agencies and patients alike to comprehend. Given this relatively pioneering approach, it has taken a lot of time to develop the infrastructure and analytical capacity to address the magnitude of the study we have undertaken. We are learning the hard way that datasets of this size are at best unwieldy. The compute resources alone have taken a year to master and apply, yielding the variations in each genome that seem to be more frequent in patients who have the disease.

In parallel, schizophrenia studies such as the Psychiatric Genomics Consortium (PGC) boast 123,000 samples from people with a diagnosis of schizophrenia, bipolar disorder, ADHD, or autism and 80,000 controls, collected by over 300 scientists from 80 institutions in 20 countries. Given the magnitude and complexity of these projects, it fast becomes clear that collaboration, data sharing and internal communication are powerful drivers of success, in contrast to traditional innovation, insight and raw scientific discovery.

Other diseases are by no means on the sidelines. Although relatively rare, the debilitating and lethal neurodegenerative disorder Amyotrophic Lateral Sclerosis is also on the “galactic plane”. Combining resources at a global scale, Project MinE has generated over 5000 full genome sequences with a goal of completing another 10 000 within a year. Whole countries are sequencing their populations. Here in the UK the plan is to complete 100 000 genomes by 2017. In Qatar they will sequence 300 000; in the USA the plan is to complete 1 million people’s whole genomes. The tiniest nations are also on the plane: even the Faroe Islands plan to complete 50 000 subjects, and Iceland has just published the first 2000 whole genome sequences in its population. Although this sounds like a great deal, the means to adequately process and analyse these and other large-scale datasets are in their infancy. How can we analyse all this data? One way is the obvious route of training and education. We are part of a new national programme to establish graduate training in genome medicine. Offered here at Sheffield, the MSc makes a solid step towards beginning to understand the use of genome data for health.

“The critical intersection of Genomics Big Data Medicine, delving into ‘bleeding edge’ technology & approaches that will deeply shape the future.”

Also here in the UK we have been building groups across institutions so that we can collaborate to analyse and handle big health data. Later today I meet with representatives from across Sheffield who will become part of a “Health North” initiative that looks to combine de-identified, consented health and environmental data across cities so that we can ultimately engage in actioning new forms of health data analysis.

In my view, Eric Schadt currently leads the new field at the intersection between big data, genomics and medicine – at least in terms of vision. He has driven the development of multi-scale biological research projects that have captured thousands of genomes, clinical records, related datasets and drug profiles to launch a new form of highly networked big data medicine. The first really broadly accessible application of this will be the launch of a new app, together with Apple’s health ecosystem Apple ResearchKit, that will help doctors interpret medical data on an iPhone. What data is that? Simply put, it’s your lifestyle – how many steps you take, how many stairs you climb, your blood pressure, blood oxygen, when and where. Ultimately, combining that with genomics and other health data means that apps in the future could have the potential to truly and effectively predict when you and you alone are most likely to die. Schadt calls his adventure the ‘death app’ – not a name that is likely to live long.

Posted in Uncategorized | 1 Comment

Why I left Harvard and went to Sheffield. The reveal.

I explain why I left Harvard in a talk: Inaugural Lecture

Breaking the Human Genome Code – Opening Pandora’s Box?

At the end of the talk (min 47:00)  there is a reveal as to why I recently left Harvard to move to Sheffield.

In a nutshell, this is a place where the community is poised, capable and ready to deliver on key aspects of precision medicine. I think the community here is ready to bring all the moving parts together to leverage the UK 100K Genomes Project (#100KgP), the NHS, the superb research and clinical data environment, and the highly motivated, focus-funded research and research management community. Here we have machine learning, computational modelling, computational biology, clinical records, patients, physicians and basic scientists, all focused on key aspects of the diseases we action.

This is a place I respect, and where I am respected. I am able to impact health directly here as a focused force. Harvard has aspects of what I describe in droves. It’s just that here I can work directly with a highly specialised team of collaborators with direct engagement across social sciences, healthcare, data science, government strategy and institutional development.

I’m looking for folks to join the new centre for genome translation we are building here. Doc? Programmer? Research Scientist? Think about a future with us.

Posted in Uncategorized | 1 Comment

Human genome has gone from What, to Where and When

Today’s Nature has published our article that describes a comprehensive, detailed map of the way genes are active across the major cells and tissues of the human body.

The findings describe the complex networks that govern gene activity – detailing each promoter for each gene, and showing how it is active in each cell type. In parallel we’ve developed a program (CAGExploreR) that allows the detection of how promoter use changes as you go from each type of cell to each other type of cell. This is the clearest picture yet of how human genes are regulated in the vast array of cell types in the body – work that should help people target genes linked to disease.
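To make the idea of "changing promoter use" concrete, here is a minimal Python sketch. It is not CAGExploreR and does not reproduce its statistics; the gene, the promoter names (p1, p2) and the tag counts are hypothetical, made up purely for illustration. It simply expresses each promoter's activity as a fraction of the gene's total activity in a cell type and then reports how that fraction shifts between two cell types.

```python
from typing import Dict


def promoter_usage(counts: Dict[str, int]) -> Dict[str, float]:
    """Turn raw tag counts per promoter into fractions of the gene's total."""
    total = sum(counts.values())
    if total == 0:
        # Gene not active in this cell type: every promoter gets zero usage.
        return {promoter: 0.0 for promoter in counts}
    return {promoter: count / total for promoter, count in counts.items()}


def usage_shift(cell_a: Dict[str, int], cell_b: Dict[str, int]) -> Dict[str, float]:
    """Change in usage fraction for each promoter going from cell_a to cell_b."""
    usage_a = promoter_usage(cell_a)
    usage_b = promoter_usage(cell_b)
    promoters = sorted(set(usage_a) | set(usage_b))
    return {p: usage_b.get(p, 0.0) - usage_a.get(p, 0.0) for p in promoters}


# Hypothetical tag counts for one gene with two promoters (illustrative only).
neuron = {"p1": 80, "p2": 20}
macrophage = {"p1": 30, "p2": 70}

shift = usage_shift(neuron, macrophage)
for promoter, delta in shift.items():
    print(f"{promoter}: usage changes by {delta:+.2f}")
```

A real analysis would also need replicates and a statistical test to decide whether a shift is more than noise; this sketch only shows the quantity being compared.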

This means that we can see in detail exactly where genes are initiating their activity in each major cell type in the body. Now we know where to look for genes that may be related to disease in, for instance, dendrites, neurons, macrophages, skin cells …

The release is here and the article is here

 

Posted in bioinformatics, Data sharing, expression atlas, Gene regulation, genome data sharing, Genomics, networks, open access, pathways, stem cell bioinformatics, Uncategorized | Leave a comment

‘Occupy Science’: Sage Commons Congress marching to the transformation of attitude to biomedical research

Working on a project with an incredible sense of joy – imagine that?

Bioscience? Well, this is a battle of attitude – share your data, attribute contribution by DOI links to the data you deposit, share your methods in real time, attribute the authors through provenance of their contributions, and publish as an ensemble with the software platform supporting interaction and editing of the data you have developed. It’s Science Social Innovation writ large.

Real-time drug responses from Citizens? Patients taking control of their own data?

This congress is an eclectic collection of thought leaders, TED-style talks and actual hard-core biomedical research, meeting around the virtual fireplace of an open data-sharing software system.

As I write, the founder of Red Hat, Bob Young, talks to us about ‘make things for what people need, not what they want’.

Most impact is coming from actual studies where crowdsourcing of a problem within a commons produces a spectacular, efficient result: the Breast Cancer Dream Challenge.

The most useful outcome is the Commons environment “Synapse”, where researchers can develop shared systems approaches to the interpretation of biomedical phenotype/genomic data on a common platform, using common tools, with, remarkably, provenance on the methods and data. “Collaborate for the cure” is the motto, but it reminds me of a BBQ meet, so perhaps they should change that.

Health activism – Joep Lange described how he worked with world organisations and pharma companies to make drugs affordable for HIV. That’s now morphing into making drugs affordable for chronic complex diseases in low-income countries, where the diseases are most prevalent and where most people are sick but have no funds for drugs. I met several groups working on this problem – something I’m asking philanthropic organisations to consider more seriously.

Take home: Disease philanthropic organisations such as the National Brain Tumor Society now want to actively support systems biology approaches to understanding diseases such as glioblastoma – hey – this is great. 

I love this atmosphere. The highlight for me was a talk by the CEO of Al Jazeera, Wadah Khanfar, showing us that depth and sincerity in journalism are the real value we need:

Impact of conventional wisdom is to rot the soul. Acknowledge the voice of the youth who really know the news

This has been an eye opening experience – I know my science will change from here on. 

Join me

#sagecon

Posted in Uncategorized | Leave a comment

Commons sharing – buzz at this year’s Translational Bioinformatics Conference

Our work on a stem cell commons, an as-open-as-you-choose data sharing system for placing stem cell molecular and experimental information into context, was shared with the community by Shannan Ho Sui, the program director, at the TBI last week. Tweets at #TBI2013 covered it – and tweeted us nicely, as they say. Why? Because it’s hard to bring together researchers, their NGS data and the molecular profiles that result, and then to combine them in order to find the underlying shared function, meaning and interests. The Harvard Stem Cell Institute Stem Cell Commons plans to do just that. Find our presentation here. Genomeweb shout out here.

Posted in Uncategorized | Leave a comment