Blog

8th TUC Meeting - Juan Sequeda (Capsenta). Integrating Data using Graphs and Semantics

  • Posted on: 8 February 2017
  • By: Adrian Diaz

Juan Sequeda, co-founder of Capsenta, gave an interesting talk on how we can integrate data using graphs and semantics (semantic data virtualization). As Mr. Sequeda said, the idea is to integrate data without needing to move it around. Juan started his presentation by describing the huge gap that exists between IT departments, the guardians of the data, and business development departments, which try to extract insights from that data. He used a clear example to illustrate this gap:

8th TUC Meeting - Zhe Wu (Oracle USA). Bridging RDF Graph and Property Graph Data Models.

  • Posted on: 7 February 2017
  • By: Adrian Diaz

During the 8th TUC Meeting held at Oracle’s facilities in Redwood City, California, Zhe Wu, Software Architect at Oracle Spatial and Graph, explained how his team is trying to bridge the RDF Graph and Property Graph data models.

After a brief overview of what a graph is, he presented Oracle’s graph strategy: they basically treat graphs as another data type on every platform (Hadoop, Oracle’s own database and, of course, the Cloud). He also explained that his team is developing in 3 directions at the same time:

8th TUC Meeting – Yinglong Xia (Huawei), Big Graph Analytics Engine

  • Posted on: 3 February 2017
  • By: Adrian Diaz

Yinglong started his talk by introducing his new position at Huawei, what the company does and, more specifically, how it is involved in Big Data research and graphs. He also explained that his research center is currently working on Big Data Analytics and Management from 4 sides: Natural Language Processing, Graph Analytics, Machine Learning and Deep Learning. His team, in turn, focuses on 4 market segments: financial graph analytics, consumer data gathered from smartphones and other portable devices, telecommunications and cloud technology.

8th TUC Meeting – George Fletcher (TU Eindhoven), gMark: Schema-driven data and workload generation for graph databases

  • Posted on: 31 January 2017
  • By: Adrian Diaz

George Fletcher, Associate Professor at the Eindhoven University of Technology, presented gMark, an open-source framework for generating synthetic graph instances and workloads. The main focus of gMark has been to support different graph data management scenarios that are often driven by query workloads, such as multi-query optimization, workload-driven graph database physical design, or mapping discovery and query rewriting in data integration systems.

8th TUC Meeting – Marcus Paradies (SAP) Social Network Benchmark, Business Intelligence Workload

  • Posted on: 27 January 2017
  • By: Adrian Diaz

Marcus Paradies, Software Developer at SAP, extended the talk Arnau Prat gave about the SNB, in this case covering the Business Intelligence workload. In contrast with the 17+4 queries of the Interactive workload, the Business Intelligence (BI) workload consists of 24 queries that can be seen as OLAP-style, as opposed to the OLTP-style queries of the Interactive one. The BI workload focuses on analytic queries that touch the whole graph.

8th TUC Meeting - Sergey Edunov (Facebook). Generating realistic trillion-edge graphs.

  • Posted on: 24 January 2017
  • By: Adrian Diaz

Sergey Edunov, Software Engineer at Facebook, gave a great talk on how and why his company is generating large-scale social graphs. The underlying reasons for starting such an ambitious project are capacity planning, to make sure that their system will be able to handle a graph that keeps growing year after year, and fair evaluation of their system against the ones being implemented by other companies.

8th TUC Meeting - Weining Qian (ECNU). On Statistical Characteristics of Real-Life Knowledge Graphs.

  • Posted on: 23 January 2017
  • By: Adrian Diaz

Weining Qian, professor at East China Normal University, presented his talk on Statistical Characteristics of Real-Life Knowledge Graphs during the 8th TUC Meeting held at Oracle’s facilities in Redwood City, California.

Qian explained that the term knowledge graph was introduced by Google in 2012 and that it has been an evolution of the semantic web. Professor Qian then introduced the main questions of his talk: how can we efficiently manage knowledge graphs? Are the existing benchmarks sufficient to test them, given that most of these benchmarks focus only on social networks?

8th TUC Meeting - Peter Boncz (CWI). Query Language Task Force status

  • Posted on: 19 January 2017
  • By: Adrian Diaz

Peter Boncz, Research Scientist at the Centrum Wiskunde & Informatica in the Netherlands, gave an update on the Graph Query Language Task Force, one year after its creation. This Task Force was created to address an issue detected during the benchmark meetings: all the workloads are specified in English text because there is no common graph query language.

8th TUC Meeting - Lijun Chang (University of New South Wales). Efficient Subgraph Matching by Postponing Cartesian Products.

  • Posted on: 11 January 2017
  • By: Adrian Diaz

Lijun Chang, DECRA Fellow at the University of New South Wales, talked about how to make subgraph matching more efficient by postponing Cartesian products. The key problem he explained was the extraction of subgraph isomorphic embeddings. The applications of this process are wide enough to cover protein interaction research, social network analysis and even chemical compound investigation. Testing subgraph isomorphism is an NP-complete problem; his team, however, is focusing on enumerating all subgraph embeddings, which, he explains, is even harder.

8th TUC Meeting - Jerven Bolleman (SIB Swiss Institute of Bioinformatics/UniProt consortium). UniProt: challenges of a public SPARQL endpoint.

  • Posted on: 3 January 2017
  • By: Adrian Diaz

Jerven Bolleman, Lead Software Developer at Swiss-Prot Group, explained why they are offering a free SPARQL and RDF endpoint for the world to use and why it is hard to optimize. The data biologists use tends to be extremely ambiguous and dirty. Additionally, scientists are always trying to find new questions to ask, which makes UniProt difficult to optimize: by optimizing for particular query patterns, they wouldn’t be offering the right service to their users. Furthermore, since UniProt is publicly funded, all the data needs to be public.

8th TUC Meeting - Martin Zand (University of Rochester Clinical and Translational Science Institute). Graphing Healthcare Networks: Data, Analytics and Use Cases

  • Posted on: 27 December 2016
  • By: Adrian Diaz

Martin Zand, Professor of Medicine and Public Health Sciences at the Rochester Center for Health Informatics, switched the focus of the presentations by speaking as a user of graph databases. Zand highlighted the relevance of graphs in healthcare by relating 3 characteristics of healthcare to their graph counterparts:

  • Healthcare is delivered by networks.
  • Patients traverse those networks.
  • The topology of the networks influences outcomes.

Dr. Zand's talk was structured around the presentation of 3 use cases:

8th TUC Meeting - Tim Hegeman (TU Delft). Social Network Benchmark, Analytics workload.

  • Posted on: 23 December 2016
  • By: Adrian Diaz

Tim Hegeman from TU Delft presented a very interesting talk about the Social Network Benchmark Analytics workload. Graphalytics is a benchmark developed by TU Delft for graph analytics, that is, complex and holistic graph computations.

As of today, over 100 graph analytics systems exist, Hegeman explains, but they’re not comprehensive, and that is where Graphalytics excels. It consists of algorithms and datasets (the workload) that have been selected using a 2-stage process to ensure the representativeness of the workload. The stages of the process were:

8th TUC Meeting - Social Network Benchmark: Interactive Workload

  • Posted on: 20 December 2016
  • By: Adrian Diaz

Arnau Prat, Lead Researcher at DAMA-UPC from the Technological University of Catalonia, presented a talk on the Interactive Workload of the Social Network Benchmark. One of the key aspects of his talk was the introduction of the SNB Data Generator, a tool that generates a social network (groups, posts, likes…) with a Facebook-like degree distribution. This synthetic social network follows the principle of homophily, isn’t uniform, and allows fair comparison and reproducibility of benchmark executions, while also being scalable thanks to Apache Hadoop.

9th TUC Meeting, SAP Headquarters in Walldorf Germany, February 9-10 2017

  • Posted on: 25 November 2016
  • By: Damaris Coll

The LDBC consortium is pleased to announce its Ninth Technical User Community (TUC) meeting.

This will be a two-day event at the SAP Headquarters in Walldorf, Germany, from Thursday 9th to Friday 10th of February 2017.

This will be the third TUC meeting after the finalisation of the LDBC FP7 EC funded project. The event will basically set the following aspects:

8th TUC Meeting presentations: Introduction and status update

  • Posted on: 5 October 2016
  • By: Damaris Coll

The 8th edition of the Technical User Community Meeting took place on the 22nd and 23rd of June at Oracle headquarters in Redwood Shores (California).

During these two days LDBC hosted more than 20 presentations from key members of industry, such as Oracle, Facebook, Neo4j, SAP and Huawei, and of research, covering updates on the work within the council as well as graph and RDF applications. We are going to share all of them as independent blog posts during the following weeks.

Thanks to Oracle for hosting this event!

GRADES workshop & 8th LDBC Technical User Community at the Oracle Conference Center, 22-24 June 2016

  • Posted on: 23 March 2016
  • By: Damaris Coll

The GRADES2016 workshop on Graph Data management Experiences & Systems will be held on the 24th of June 2016, just before SIGMOD/PODS 2016, at the Oracle Conference Center in Redwood Shores. In the two days preceding GRADES, Wednesday June 22 and Thursday June 23, LDBC will organize a 2-day Technical User Community (TUC) meeting for academics, industry and practitioners in the area of graph data management.

Sixth TUC Meeting: Parallel and incremental materialisation of RDF/DATALOG in RDFox by Boris Motik (University of Oxford)

  • Posted on: 22 July 2015
  • By: Ricard Tapias

During the second day of the sixth TUC meeting, Boris Motik from the University of Oxford presented his talk “Parallel and incremental materialisation of RDF/DATALOG in RDFox”. Like the slides of the other TUC meeting talks, this presentation is available on the LDBC Slideshare profile.

Sixth TUC Meeting: MODAClouds Decision Support System for Cloud Service Selection by Smrati Gupta (CA Technologies)

  • Posted on: 15 July 2015
  • By: Damaris Coll

During the last TUC Meeting in Barcelona we were glad to welcome Smrati Gupta from CA Technologies, a leading company that creates systems software that runs in mainframe, distributed computing, virtual machine and cloud computing environments.

Sixth TUC Meeting: E-Commerce and Graph-driven Applications - Experiences and Optimizations while moving to Linked Data by Andreas Both (Unister)

  • Posted on: 10 July 2015
  • By: Ricard Tapias

Andreas Both from Unister presented another great talk on the second day of the 6th LDBC Technical User Community (TUC) meeting held in Barcelona. His talk “E-Commerce and Graph-driven Applications: Experiences and Optimizations while moving to Linked Data” revolved around an e-commerce use case.

Sixth TUC Meeting: LDBC Social Network Benchmark Interactive Workload by Arnau Prat (UPC, Sparsity)

  • Posted on: 26 June 2015
  • By: Arnau Prat

During the 6th TUC Meeting in Barcelona we were glad to welcome Arnau Prat from Universitat Politècnica de Catalunya (BarcelonaTech) and Sparsity Technologies with the presentation “LDBC Social Network Benchmark Interactive Workload”.

Elements of instance matching benchmarks: a short overview

  • Posted on: 16 June 2015
  • By: Irini Fundulaki

The number of datasets published in the Web of Data as part of the Linked Data Cloud is constantly increasing. The Linked Data paradigm is based on the unconstrained publication of information by different publishers, and the interlinking of web resources through “same-as” links which specify that two URIs correspond to the same real world object. In the vast number of data sources participating in the Linked Data Cloud, this information is not explicitly stated but is discovered using instance matching techniques and tools.

Sixth TUC Meeting: Recent Updates on IBM System G – GraphBIG and Temporal Data by Yinglong Xia (IBM)

  • Posted on: 11 June 2015
  • By: Ricard Tapias

On the second day of the 6th LDBC TUC Meeting that took place in Barcelona we welcomed Yinglong Xia from IBM Research with his presentation “Recent Updates on IBM System G – GraphBIG and Temporal Data”.

SNB Interactive Part 2 - Modeling Choices

  • Posted on: 26 May 2015
  • By: Orri Erling

Note: this post is a continuation of "SNB Interactive Part 1 - What is SNB Interactive Really About?" post by Orri Erling.

SNB Interactive is the wild frontier, with very few rules. This is necessary, among other reasons, because there is no standard property graph data model, and because the contestants support a broad mix of programming models, ranging from in-process APIs to declarative query.

LDBC participates in the 36th edition of the ACM SIGMOD/PODS conference

  • Posted on: 25 May 2015
  • By: Damaris Coll

LDBC is presenting two papers at the next edition of the ACM SIGMOD/PODS conference, held in Melbourne from May 31st to June 4th, 2015. The annual ACM SIGMOD/PODS conference is a leading international forum for database researchers, practitioners, developers, and users to explore cutting-edge ideas and results, and to exchange techniques, tools and experiences.

Sixth TUC Meeting: SADI: A design-pattern for “native” Linked-Data Semantic Web Services by Mark D. Wilkinson (UPM)

  • Posted on: 20 May 2015
  • By: Damaris Coll

During the last TUC Meeting in Barcelona we were glad to welcome Mark D. Wilkinson from Universidad Politécnica de Madrid with his presentation "SADI: A design-pattern for “native” Linked-Data Semantic Web Services".

Check the slides and video to learn more about how SADI uses OWL and RDF, and about SHARE, the health research environment that answers SPARQL queries with SADI.

SNB Interactive Part 1 - What is SNB Interactive Really About?

  • Posted on: 14 May 2015
  • By: Orri Erling

This post is the first in a series of blogs analyzing the LDBC Social Network Benchmark Interactive workload. It is written from the dual perspective of participating in the benchmark design and of building the OpenLink Virtuoso implementation of the same.

Sixth TUC Meeting: 20 billion triples in production by Jerven Bolleman (Swiss Institute of Bioinformatics)

  • Posted on: 13 May 2015
  • By: Damaris Coll

The second part of the presentations during the first day of the TUC Meeting in Barcelona started with a presentation by Jerven Bolleman from the Swiss Institute of Bioinformatics called "20 billion triples in production".

Watch Jerven Bolleman talking about why and how the UniProt SPARQL endpoint allows working with billions of triples from biological datasets.

Sixth TUC Meeting: Lighthouse - Large-scale graph pattern matching on Giraph by Claudio Martella (VUA)

  • Posted on: 6 May 2015
  • By: Damaris Coll

To end the morning slot of the first day of the 6th TUC Meeting in Barcelona, Claudio Martella from VUA presented "Lighthouse: Large-scale graph pattern matching on Giraph".

Watch Claudio Martella to learn more about Lighthouse and how it uses Giraph and Cypher.

Sixth TUC Meeting: Titan DB on LDBC SNB Interactive by Tomer Sagi (HP)

  • Posted on: 29 April 2015
  • By: Damaris Coll

For the third presentation of the 6th TUC Meeting held in Barcelona we were pleased to welcome Tomer Sagi from HP.

Watch his presentation called "HP Labs: Titan DB on LDBC Interactive", where Tomer introduced the history of the research performed at HP along with his latest fields of work.

Sixth TUC Meeting: SPIMBENCH for the Semantic Publishing Domain by Nina Saveta (FORTH)

  • Posted on: 22 April 2015
  • By: Damaris Coll

The second presentation of LDBC's 6th TUC Meeting, which took place in Barcelona, was called "SPIMBENCH: A Scalable, Schema-Aware, Instance Matching Benchmark for the Semantic Publishing Domain" and was presented by Tzanina Saveta from FORTH.

Watch Tzanina Saveta presenting the benchmark for the Semantic Publishing domain.

Why do we need an LDBC SNB-specific workload driver?

  • Posted on: 21 April 2015
  • By: Alex Averbuch

In a previous 3-part blog series we touched upon the difficulties of executing the LDBC SNB Interactive (SNB) workload, while achieving good performance and scalability. What we didn't discuss is why these difficulties were unique to SNB, and what aspects of the way we perform workload execution are scientific contributions - novel solutions to previously unsolved problems. This post will highlight the differences between SNB and more traditional database benchmark workloads. Additionally, it will motivate why we chose to develop a new wo

Sixth TUC Meeting: Semantic Publishing Benchmark v2.0 by Venelin Kotsev (Ontotext)

  • Posted on: 15 April 2015
  • By: Damaris Coll

The sixth edition of the Technical User Community Meeting took place in Barcelona on the 19th and 20th of March. During these two days LDBC hosted more than 15 presentations on graphs and RDF from key members of industry and research, which we are going to share as independent blog posts during the following weeks.

Watch the first presentation with Venelin Kotsev presenting the details of the evolution of the SPB.

OWL-empowered SPARQL Query Optimization

  • Posted on: 18 February 2015
  • By: Irini Fundulaki

The Linked Data paradigm has become the prominent enabler for sharing huge volumes of data using Semantic Web technologies, and has created novel challenges for non-relational data management systems, such as RDF and graph engines. Efficient data access through queries is perhaps the most important data management task, and is enabled through query optimization techniques, which amount to the discovery of optimal or close to optimal execution plans for a given query.

Person activity subgraph features in LDBC DATAGEN

  • Posted on: 3 February 2015
  • By: Arnau Prat

When talking about DATAGEN and other graph generators with social network characteristics, our attention is typically drawn to the friendship subgraph and/or its structure. However, a social graph is more than a bunch of people connected by friendship relations; it contains many other things worth looking at. With a quick look at commercial social networks like Facebook, Twitter or Google+, one can easily identify a lot of other elements such as text, images or even video assets. More importantly, all these elements form other subgraphs within the social network!

5th LDBC TUC meeting

  • Posted on: 27 January 2015
  • By: Ioan Toma

The 5th LDBC Technical User Community (TUC) meeting took place in Athens on 14.11.2014 and was well attended by both industry and academia from the graph and RDF database communities. In the morning session, members of the LDBC project gave an update on the status of the project and its benchmarks: