Stratosphere Summit

November 14-15, Berlin

Discover the power of Big Data analytics with the Stratosphere system. Register now!

Schedule

Thursday, November 14

8:45-9:00 Opening
9:00-10:00 From an idea to a disruptive technology, Ijad Madisch, ResearchGate
10:00-10:30 Coffee break
10:30-12:00 The Stratosphere Platform for Data Analytics and recent updates
12:00-14:00 Lunch
14:00-15:00 Industry Convergence towards Data Driven Approaches - a Telco Carrier Perspective by Marten Schoenherr, Deutsche Telekom
15:00-15:30 Coffee break and laptop set-up
15:30-18:30 Hands-on session: Data analysis with Stratosphere.
Tip: The Recommender Stammtisch is taking place on Thursday, November 14, starting at 19:30 at the offices of ResearchGate.

Friday, November 15

9:00-10:00 Beyond HDFS Storage in NoSQL Systems, Jens Dittrich, Saarland University
10:00-10:30 Coffee break
10:30-11:30 Keynote by Julien Masanes, Internet Memory Research
11:30-12:00 Closing remarks
12:00-14:00 Poster and demo session: Research with Stratosphere. Buffet lunch will be served.

Program

From an Idea to a Disruptive Technology, Ijad Madisch (ResearchGate)

"A social network for researchers? Forget it, that’s impossible. Researchers aren’t social" – that’s what Ijad Madisch’s professor at the University in Hanover, Germany, told him when he proposed his idea for ResearchGate. Five years later, over three million researchers use the network to collaborate worldwide, share results and build a reputation, so that cures for diseases and solutions to the world energy crisis can be found faster. Hear from Ijad Madisch, CEO and co-founder of ResearchGate, on how he turned a crazy idea into a disruptive technology that’s changing the world of science - and why we need smarter software frameworks to drive progress.

Ijad Madisch

Dr. Ijad Madisch is the CEO and co-founder of ResearchGate. The professional network for scientists helps more than 3 million researchers worldwide to collaborate, share their results and make a name for themselves. Madisch has a background in science himself; he studied medicine in Hannover and at Harvard Medical School. Afterwards he spent a few years in Boston, working on an interdisciplinary project at the Department of Radiology at Massachusetts General Hospital. Disappointed by inefficient practices in the research world, he founded ResearchGate together with fellow physician Dr. Sören Hofmayer and IT specialist Horst Fickenscher in 2008. Since then, ResearchGate has attracted investments from several renowned venture capitalists and private investors, among them Bill Gates, Benchmark and Founders Fund. Madisch serves as the company’s President of the Board of Directors.

The Stratosphere Platform for Big Data Analytics, Kostas Tzoumas (TU Berlin)

Big Data analysis is diversifying to use cases that require low latency and go beyond functionality that naturally fits the MapReduce paradigm. We created the Stratosphere platform to support the next generation of data analysis needs. Stratosphere is a new parallel data management engine that combines a unique set of features for Big Data management: it provides a declarative language for the specification of data analysis programs, analyzes data “in situ” by embracing external data sources (including HDFS), features a rich parallelization model that goes beyond MapReduce, and deeply embeds user-defined code in the system’s query optimizer and runtime. In addition, Stratosphere efficiently executes iterative algorithms, widening the use cases of such systems to fields such as Machine Learning and graph analysis. Stratosphere is compatible with the Hadoop ecosystem and is developed by three universities in the Berlin area in Germany, together with a growing open-source developer and user community.

You can download the slides of both talks here:

Kostas Tzoumas

Dr. Kostas Tzoumas is a postdoctoral researcher at the Technical University of Berlin working on various system aspects of Big Data such as parallel data management, programming models, program optimization, system architectures, and benchmarking. He is co-leading the Stratosphere project, where a community of developers and researchers is creating a next-generation platform for Big Data analytics. Kostas received his PhD from Aalborg University, Denmark, in 2011 and graduated from the National Technical University of Athens, Greece, in 2007.

Stephan Ewen

Stephan Ewen is a Ph.D. student at the Berlin University of Technology. He is working on the Stratosphere project, which is creating a next-generation system for Big Data analysis. Stephan has architected and co-architected many components of Stratosphere, including the programming abstractions, the compiler and optimizer, the query runtime, and the support for iterative algorithms. Aside from his work on Stratosphere, Stephan was an intern in the field of data analytics at Microsoft Research, IBM Research, and IBM Germany Development. He holds a master's degree in computer science from the University of Stuttgart.

Industry Convergence towards Data Driven Approaches - a Telco Carrier Perspective, Marten Schoenherr (Deutsche Telekom Laboratories)

Marten Schoenherr

Marten Schoenherr works for Deutsche Telekom Laboratories, where he is in charge of scaling compute infrastructure and implementing the respective use cases, such as the deployment and usage of big data workbenches. Before that, Marten spent many years as a researcher in computer science and founded two high-tech companies.

Beyond HDFS Storage in NoSQL Systems, Jens Dittrich (Saarland University)

The distributed file system HDFS is a building block for several NoSQL systems such as Hadoop MapReduce, HBase, and Stratosphere. The current version of HDFS, however, is basically a block store providing replication-based failover and efficient scale-out. In this talk I will give an overview of the HDFS replacement HAIL (Hadoop Aggressive Indexing Library) developed at Saarland University. HAIL is an extension of HDFS that preserves the failover and scaling properties of HDFS. Beyond what HDFS offers, however, we add automatic index creation already at data upload, adaptive indexing, early map phase execution, as well as storage re-balancing. All these techniques come with little to zero overhead and can be used with all software layers currently operating on HDFS. I will show experimental results demonstrating that HAIL supports efficient index creation and query processing while speeding up queries by up to a factor of 60.

Jens Dittrich

Jens Dittrich is a Full Professor of Computer Science/Databases at Saarland University, Germany. Previous affiliations include U Marburg, SAP AG, and ETH Zurich. He received an Outrageous Ideas and Vision Paper Award at CIDR 2011, two CS teaching awards for database systems in 2011 and 2013, as well as several presentation awards. His research focuses on fast access to big data. Since 2013 he has taught his classes on data management as inverted classrooms; see here for a list of videos.

Title TBA, Julien Masanes (Internet Memory)

Julien Masanes

Julien Masanès has been involved in web archiving for the last 10 years. He first directed the Web Archiving Project at the Bibliothèque Nationale de France from 2000 to 2004. He also actively participated in the creation of the International Internet Preservation Consortium (IIPC), which he coordinated during its first two years. He also launched and presently chairs the International Web Archiving Workshop (IWAW) series, the main international rendezvous in this field. Involved since the foundation of the European Archive, now the Internet Memory Foundation, Julien Masanès has focused on developing a web-scale archive that is both open to the general public and useful for research, gaining support for this vision from various stakeholders in Europe (European Commission, research labs, open access groups).

Poster and Demo Session: Research with Stratosphere

  • Data Integration with Stratosphere. Arvid Heise, Hasso-Plattner Institute
  • Web Data Analytics. Astrid Rheinlaender, Humboldt University of Berlin
  • PAXQuery: Efficient Parallel Processing of Complex XQuery. Jesus Camacho Rodriguez, Inria Saclay
  • HDFS/YARN under Stratosphere. Jim Dowling, KTH
  • Aura - Stratosphere II Runtime Execution Engine. Tobias Herb, TU Berlin
  • The Nephele Livescale Toolkit: Real-Time Video Processing At Scale. Bjoern Lohrmann, TU Berlin
  • Iterative Parallel Data Processing with Stratosphere: An Inside Look. Stephan Ewen, TU Berlin
  • Peeking into the Optimization of Data Flow Programs with MapReduce-style UDFs. Fabian Hueske, TU Berlin
  • Stratosphere: Past, present, and future. Asterios Katsifodimos, TU Berlin

Registration

The event is free of charge. Please register here.

Location


Science and Technology Park Adlershof
Max-Born-Hall
Carl-Scheele-Straße/Max-Born-Straße
12489 Berlin.

Accommodation

The following hotels are within walking distance of the event venue:

Airporthotel Berlin-Adlershof
Rudower Chaussee 14
12489 Berlin
Tel.: +49 (0) 30/ 7202222 - 000

Dorint Adlershof Berlin
Rudower Chaussee 15
12489 Berlin
Tel.: +49 30 67 822-0

#StratoSummit on Twitter