Cutty: Aggregate Sharing for User-defined Windows
Paris Carbone, Jonas Traub, Asterios Katsifodimos, Seif Haridi, Volker Markl
CIKM 2016
  
 
Bridging the Gap: Towards Optimizations across Linear and Relational Algebra
Andreas Kunft, Alexander Alexandrov, Asterios Katsifodimos, Volker Markl
BeyondMR 2016, ACM SIGMOD Workshop 2016
RDFind: Scalable Conditional Inclusion Dependency Discovery in RDF Datasets
Sebastian Kruse, Anja Jentzsch, Thorsten Papenbrock, Zoi Kaoudi, Jorge Arnulfo Quiané-Ruiz, Felix Naumann
SIGMOD 2016
  
		Potential and Pitfalls of Domain-Specific Information Extraction at Web Scale
Astrid Rheinländer, Mario Lehmann, Anja Kunkel, Jörg Meier, Ulf Leser
SIGMOD 2016
		[PDF]
	
		Implicit Parallelism through Deep Language Embedding
Alexander Alexandrov, Asterios Katsifodimos, Georgi Krastev, Volker Markl
SIGMOD Record, March 2016
		[PDF]
	
		Emma in Action: Declarative Dataflows for Scalable Data Analysis
Alexander Alexandrov, Andreas Salzmann, Georgi Krastev, Asterios Katsifodimos, Volker Markl
Demo at SIGMOD 2016
		[PDF]
	
	
Apache Flink: Stream and Batch Processing in a Single Engine
Paris Carbone, Stephan Ewen, Seif Haridi, Asterios Katsifodimos, Volker Markl, Kostas Tzoumas
IEEE Data Engineering Bulletin, in the special issue on Next-gen Stream Processing (December 2015, Vol. 38 No. 4)
		Elastic Stream Processing with Latency Guarantees
Björn Lohrmann, Peter Janacik, Odej Kao
ICDCS 2015
	
		SOFA: An Extensible Logical Optimizer for UDF-heavy Data Flows
Astrid Rheinländer, Arvid Heise, Fabian Hueske, Ulf Leser, Felix Naumann
Information Systems, Elsevier, 2015
	
		Implicit Parallelism through Deep Language Embedding
Alexander Alexandrov, Andreas Kunft, Asterios Katsifodimos, Felix Schüler, Lauritz Thamsen, Odej Kao, Tobias Herb, Volker Markl
SIGMOD 2015
		[PDF]
	
		Optimistic Recovery For Iterative Dataflows in Action
Sergey Dudoladov, Chen Xu, Sebastian Schelter, Asterios Katsifodimos, Stephan Ewen, Kostas Tzoumas, Volker Markl
Demo at SIGMOD 2015
	
Scaling Out the Discovery of Inclusion Dependencies
Sebastian Kruse, Thorsten Papenbrock, Felix Naumann
BTW 2015
    [PDF]
	
Estimating the Number and Sizes of Fuzzy-Duplicate Clusters
Arvid Heise, Gjergji Kasneci, Felix Naumann
CIKM 2014
Runtime Analysis of Distributed Data Processing Programs
Marcus Leich (Advisor: Volker Markl)
PhD Workshop at VLDB 2014 (received Best Paper Award)
[PDF]
Versatile optimization of UDF-heavy data flows with Sofa
Astrid Rheinländer, Martin Beckmann, Anja Kunkel, Arvid Heise, Thomas Stoltmann, and Ulf Leser
Demo at SIGMOD 2014 
[Link]
The Stratosphere platform for Big Data Analytics
Alexander Alexandrov, Rico Bergmann, Stephan Ewen, Johann-Christoph Freytag, Fabian Hueske, Arvid Heise, Odej Kao, Marcus Leich, Ulf Leser, Volker Markl, Felix Naumann, Mathias Peters, Astrid Rheinländer, Matthias J. Sax, Sebastian Schelter, Mareike Höger, Kostas Tzoumas, Daniel Warneke
VLDB Journal 2014
		[PDF]
		"All Roads Lead to Rome:" Optimistic Recovery for Distributed Iterative Data Processing
Sebastian Schelter, Stephan Ewen, Kostas Tzoumas, Volker Markl
CIKM, 2013
		[PDF]
		
	
		Nephele Streaming: Stream Processing Under QoS Constraints at Scale
 Björn Lohrmann, Daniel Warneke, Odej Kao
Journal of Cluster Computing, Springer US, 2013
		[Link]
		[Pre-Print PDF]
	
		Adaptive Online Compression in Clouds—Making Informed Decisions in Virtual Machine Environments
 Matthias Hovestadt, Odej Kao, Andreas Kliem, Daniel Warneke
Journal of Grid Computing, Springer, 2013
		[Link]
	
		Large-Scale Social-Media Analytics on Stratosphere
 Christoph Boden, Marcel Karnstedt, Miriam Fernandez, Volker Markl
WWW, 2013 
	
		Iterative Parallel Data Processing with Stratosphere: An Inside Look
 Stephan Ewen, Sebastian Schelter, Kostas Tzoumas, Daniel Warneke, Volker Markl
SIGMOD, 2013
	
		Peeking into the Optimization of Data Flow Programs with MapReduce-style UDFs
 Fabian Hueske, Mathias Peters, Aljoscha Krettek, Matthias Ringwald, Kostas Tzoumas, Volker Markl, Johann-Christoph Freytag
ICDE, 2013
		[PDF]
		[Poster PDF]
		[Video]
	
		Applying Stratosphere for Big Data Analytics
 Marcus Leich, Jochen Adamek, Moritz Schubotz, Arvid Heise, Astrid Rheinländer, Volker Markl
BTW, 2013 (Demo)
		[PDF]
	
		Meteor/Sopremo: An Extensible Query Language and Operator Model
 Arvid Heise, Astrid Rheinländer, Marcus Leich, Ulf Leser, and Felix Naumann
 BigData Workshop (2012), affiliated with VLDB
		[PDF]
	
		Spinning Fast Iterative Data Flows
Stephan Ewen, Moritz Kaufmann, Kostas Tzoumas, Volker Markl
PVLDB 5(11), 2012, pp. 1268-1279
		[PDF]
		[DOI]
	
		Opening the Black Boxes in Data Flow Optimization
Fabian Hueske, Mathias Peters, Matthias J. Sax, Astrid Rheinländer, Rico Bergmann, Aljoscha Krettek, Kostas Tzoumas
PVLDB 5(11), 2012: pp. 1256-1267
		[PDF]
		[DOI]
	
		Myriad: Scalable and Expressive Data Generation
Alexander Alexandrov, Kostas Tzoumas, Volker Markl
PVLDB, 5(12), 2012: pp. 1890-1893 
		[PDF]
		[DOI]
	
		Massively-Parallel Stream Processing under QoS Constraints with Nephele
Björn Lohrmann, Daniel Warneke, Odej Kao
Proceedings of the 21st International Symposium on High-Performance Parallel and Distributed Computing (HPDC), ACM, 2012, pp. 271-282
		[PDF]
		[BibTex]
		[DOI]
	
		Enabling Operator Reordering in Data Flow Programs Through Static Code Analysis
Fabian Hueske, Aljoscha Krettek, Kostas Tzoumas
XLDI Workshop (2012), affiliated with ICFP
		[PDF]
	
		MapReduce and PACT - Comparing Data Parallel Programming Models 
 Alexander Alexandrov, Stephan Ewen, Max Heimel, Fabian Hueske, Odej Kao, Volker Markl, Erik Nijkamp, Daniel Warneke 
 In Proceedings of Datenbanksysteme für Business, Technologie und Web (BTW) 2011, pp. 25-44 
		[PDF]
		[BibTex]
	
		Exploiting Dynamic Resource Allocation for Efficient Parallel Data Processing in the Cloud 
 Daniel Warneke, Odej Kao 
IEEE Transactions on Parallel and Distributed Systems (TPDS), Special Issue on Many-Task Computing, 2011, pp. 985-997
		[PDF]
		[BibTex]
		[DOI]
	
		Evaluating Adaptive Compression to Mitigate the Effects of Shared I/O in Clouds 
 Matthias Hovestadt, Odej Kao, Andreas Kliem, Daniel Warneke 
 Proceedings of the 1st International Workshop on Data Intensive Computing in the Clouds (DataCloud), 2011 
		[PDF]
		[BibTex]
		[DOI]
	
		Evaluation of Network Topology Inference in Opaque Compute Clouds Through End-to-End Measurements 
 Dominic Battré, Natalia Frejnik, Siddhant Goel, Odej Kao, Daniel Warneke 
 Proceedings of the 4th IEEE International Conference on Cloud Computing (IEEE CLOUD), 2011 
		[PDF]
		[BibTex]
		[DOI]
	
		Inferring Network Topologies in Infrastructure as a Service Clouds 
 Dominic Battré, Natalia Frejnik, Siddhant Goel, Odej Kao, Daniel Warneke 
 Proceedings of the 11th International Symposium on Cluster, Cloud, and Grid computing (CCGrid), 2011, pp. 604-605 
		[PDF]
		[BibTex]
		[DOI]
	
		Myriad - Parallel Data Generation on Shared-Nothing Architectures 
Alexander S. Alexandrov, Berni Schiefer, John Poelman, Stephan Ewen, Thomas Bodner, Volker Markl 
Proceedings of the First Workshop on Architectures and Systems for Big Data (ASBD), 2011
		[PDF]
		[BibTex]
	
		Nephele/PACTs: A Programming Model and Execution Framework for Web-Scale Analytical Processing 
 Dominic Battré, Stephan Ewen, Fabian Hueske, Odej Kao, Volker Markl, and Daniel Warneke 
In Proceedings of the ACM Symposium on Cloud Computing (SoCC), ACM, 2010, pp. 119–130
		[PDF]
		[BibTex]
		[DOI]
	
		Massively Parallel Data Analysis with PACTs on Nephele 
 Alexander Alexandrov, Dominic Battré, Stephan Ewen, Max Heimel, Fabian Hueske, Odej Kao, Volker Markl, Erik Nijkamp, Daniel Warneke 
 PVLDB Vol. 3, No. 2, 2010, pp. 1625–1628 
		[PDF]
		[Poster PDF]
		[BibTex]
	
		Detecting Bottlenecks in Parallel DAG-based Data Flow Programs 
 Dominic Battré, Matthias Hovestadt, Björn Lohrmann, Alexander Stanik, Daniel Warneke 
 In Proceedings of Many-Task Computing on Grids and Supercomputers (MTAGS) 2010, pp. 1–10 
		[PDF]
		[BibTex]
		[DOI]
	
		Nephele: Efficient Parallel Data Processing in the Cloud 
 Daniel Warneke and Odej Kao 
 In Proceedings of the 2nd Workshop on Many-Task Computing on Grids and Supercomputers, MTAGS 2009 
		[PDF]
		[BibTex]
		[DOI]