8th TUC Meeting - Jerven Bolleman (SIB Swiss Institute of Bioinformatics/UniProt consortium). UniProt: challenges of a public SPARQL endpoint.
Jerven Bolleman, Lead Software Developer at the Swiss-Prot Group, explained why they offer a free SPARQL and RDF endpoint for the world to use, and why it is hard to optimize. The data biologists use tends to be extremely ambiguous and dirty, and scientists are always looking for new questions to ask. This is what makes optimizing UniProt difficult: tuning the service for particular query patterns would not serve users whose questions keep changing. Furthermore, since UniProt is publicly funded, all the data needs to be public.
Bolleman also explained the uptime challenges they face: long-running queries piling up, the fragility of HTTP, and, as one of the main issues, semantic web researchers running queries against the whole database. He then talked about the growing number of users writing their own queries on UniProt’s SPARQL page (around 2,000 to date).
Finally, he described how they offered new visualisations to their users by using SPARQL queries in a clever way. The front-end team had requested JSON files to generate the visualisations, but SPARQL let them offer the same features without any new development, since the necessary functionality was already implemented.
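To make the idea concrete, here is a minimal sketch of how a front end might get JSON out of a SPARQL endpoint using only the standard SPARQL protocol. It is not taken from Bolleman's talk: the endpoint URL, the example query, and the helper names (`build_request`, `rows_from_results`, `fetch_rows`) are assumptions chosen for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint URL; the post itself does not give one.
ENDPOINT = "https://sparql.uniprot.org/sparql"

# Hypothetical example query, capped with LIMIT so it stays cheap for the server.
QUERY = """
PREFIX up: <http://purl.uniprot.org/core/>
SELECT ?protein ?mnemonic
WHERE {
  ?protein a up:Protein ;
           up:mnemonic ?mnemonic .
}
LIMIT 5
"""


def build_request(query, endpoint=ENDPOINT):
    """Build a SPARQL protocol GET request that asks for JSON results."""
    params = urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        f"{endpoint}?{params}",
        headers={"Accept": "application/sparql-results+json"},
    )


def rows_from_results(payload):
    """Flatten a SPARQL JSON results document into a list of plain dicts."""
    return [
        {name: cell["value"] for name, cell in binding.items()}
        for binding in payload["results"]["bindings"]
    ]


def fetch_rows(query, timeout=30):
    """Send the query; the client-side timeout avoids waiting forever."""
    with urllib.request.urlopen(build_request(query), timeout=timeout) as resp:
        return rows_from_results(json.load(resp))
```

Calling `fetch_rows(QUERY)` would return a list of dictionaries that a visualisation library could consume directly. The `LIMIT` clause and the timeout are a small courtesy to a shared public endpoint, given the pile-up of long-running queries mentioned above.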
Start planning your attendance at the upcoming 9th TUC Meeting at SAP's HQ in Walldorf, Germany, on 9-10 February!
As always, you'll find Bolleman's presentation at the bottom of the post.