<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Xavier Llorà &#187; Data-Intensive Computing</title>
	<atom:link href="http://www.xavierllora.net/tag/data-intensive-computing/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.xavierllora.net</link>
	<description>A notebook on data-intensive computing, genetics-based machine learning &#38; more.</description>
	<lastBuildDate>Sun, 08 Jan 2012 19:39:15 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Parallel and Distributed Computational Intelligence book is out for pre-order</title>
		<link>http://www.xavierllora.net/2010/09/13/parallel-and-distributed-computational-intelligence-book-is-out/</link>
		<comments>http://www.xavierllora.net/2010/09/13/parallel-and-distributed-computational-intelligence-book-is-out/#comments</comments>
		<pubDate>Tue, 14 Sep 2010 04:17:35 +0000</pubDate>
		<dc:creator>Xavier</dc:creator>
				<category><![CDATA[Books]]></category>
		<category><![CDATA[Publications]]></category>
		<category><![CDATA[Data-Intensive Computing]]></category>
		<category><![CDATA[distributed computing]]></category>
		<category><![CDATA[Estimation of Distribution Algorithms]]></category>
		<category><![CDATA[genetic algorithms]]></category>
		<category><![CDATA[parallel programming]]></category>

		<guid isPermaLink="false">http://www.xavierllora.net/?p=710</guid>
		<description><![CDATA[&#8220;Parallel and Distributed Computational Intelligence&#8221; edited by Francisco Fernández de Vega &#038; Erick Cantú-Paz and published by Springer is out for pre-order. The first chapter &#8220;When Huge is Routine: Scaling Genetic Algorithms and Estimation of Distribution Algorithms via Data-Intensive Computing&#8221; of the book was written together with coauthors Abhishek Verma, Roy Campbell, and David E. [...]
Related posts:<ol>
<li><a href='http://www.xavierllora.net/2008/05/22/zookeeper-and-orchestrating-distributed-applications/' rel='bookmark' title='ZooKeeper and distributed applications'>ZooKeeper and distributed applications</a></li>
<li><a href='http://www.xavierllora.net/2009/01/29/data-intensive-computing-for-competent-genetic-algorithms-a-pilot-study-using-meandre/' rel='bookmark' title='Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study using  Meandre'>Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study using  Meandre</a></li>
<li><a href='http://www.xavierllora.net/2009/10/09/scaling-genetic-algorithms-using-mapreduce/' rel='bookmark' title='Scaling Genetic Algorithms using MapReduce'>Scaling Genetic Algorithms using MapReduce</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.springer.com/engineering/book/978-3-642-10674-3">&#8220;Parallel and Distributed Computational Intelligence&#8221;</a> edited by Francisco Fernández de Vega &#038; Erick Cantú-Paz and published by Springer is out for pre-order. The first chapter of the book, <em>&#8220;When Huge is Routine: Scaling Genetic Algorithms and Estimation of Distribution Algorithms via Data-Intensive Computing&#8221;</em>, was written together with coauthors Abhishek Verma, Roy Campbell, and David E. Goldberg, and describes how data-intensive computing can help push the size of problems that GAs and EDAs can address. You may find the abstract of the book below.</p>
<p><strong>Abstract:</strong></p>
<blockquote><p>The growing success of biologically inspired algorithms in solving large and complex problems has spawned many interesting areas of research. Over the years, one of the mainstays in bio-inspired research has been the exploitation of parallel and distributed environments to speedup computations and to enrich the algorithms. From the early days of research on bio-inspired algorithms, their inherently parallel nature was recognized and different parallelization approaches have been explored. Parallel algorithms promise reductions in execution time and open the door to solve increasingly larger problems. But parallel platforms also inspire new bio-inspired parallel algorithms that, while similar to their sequential counterparts, explore search spaces differently and offer improvements in solution quality.</p>
<p>The objective in editing this book was to assemble a sample of the best work in parallel and distributed biologically inspired algorithms. The editors invited researchers in different domains to submit their work. They aimed to include diverse topics to appeal to a wide audience. Some of the chapters summarize work that has been ongoing for several years, while others describe more recent exploratory work. Collectively, these works offer a global snapshot of the most recent efforts of bioinspired algorithms’ researchers aiming at profiting from parallel and distributed computer architectures—including GPUs, Clusters, Grids, volunteer computing and p2p networks as well as multi-core processors. This volume will be of value to a wide set of readers, including, but not limited to specialists in Bioinspired Algorithms, Parallel and Distributed Computing, as well as computer science students trying to figure out new paths towards the future of computational intelligence.</p></blockquote>
<p>Related posts:<ol>
<li><a href='http://www.xavierllora.net/2008/05/22/zookeeper-and-orchestrating-distributed-applications/' rel='bookmark' title='ZooKeeper and distributed applications'>ZooKeeper and distributed applications</a></li>
<li><a href='http://www.xavierllora.net/2009/01/29/data-intensive-computing-for-competent-genetic-algorithms-a-pilot-study-using-meandre/' rel='bookmark' title='Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study using  Meandre'>Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study using  Meandre</a></li>
<li><a href='http://www.xavierllora.net/2009/10/09/scaling-genetic-algorithms-using-mapreduce/' rel='bookmark' title='Scaling Genetic Algorithms using MapReduce'>Scaling Genetic Algorithms using MapReduce</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.xavierllora.net/2010/09/13/parallel-and-distributed-computational-intelligence-book-is-out/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Meandre 2.0 Alpha Preview = Scala + MongoDB</title>
		<link>http://www.xavierllora.net/2010/07/15/meandre-2-0-alpha-preview-scala-mongodb/</link>
		<comments>http://www.xavierllora.net/2010/07/15/meandre-2-0-alpha-preview-scala-mongodb/#comments</comments>
		<pubDate>Thu, 15 Jul 2010 19:45:00 +0000</pubDate>
		<dc:creator>Xavier</dc:creator>
				<category><![CDATA[Data-Intensive Computing]]></category>
		<category><![CDATA[Meandre]]></category>
		<category><![CDATA[Presentations]]></category>
		<category><![CDATA[Research]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[Crochet]]></category>
		<category><![CDATA[Derby]]></category>
		<category><![CDATA[JENA]]></category>
		<category><![CDATA[meandre]]></category>
		<category><![CDATA[mongodb]]></category>
		<category><![CDATA[MySQL]]></category>
		<category><![CDATA[RDF]]></category>
		<category><![CDATA[scala]]></category>
		<category><![CDATA[Snare]]></category>

		<guid isPermaLink="false">http://www.xavierllora.net/?p=697</guid>
		<description><![CDATA[A lot of water has gone under the bridge since the first release of the Meandre 1.4.X series. In January I went back to the drawing board and started sketching what was going to be the 1.5.X series. The slide deck embedded above is an extended list of the thoughts during the process. As usual, I [...]
Related posts:<ol>
<li><a href='http://www.xavierllora.net/2009/12/01/meandre-is-going-scala/' rel='bookmark' title='Meandre is going Scala'>Meandre is going Scala</a></li>
<li><a href='http://www.xavierllora.net/2010/01/21/fast-rest-api-prototyping-with-crochet-and-scala/' rel='bookmark' title='Fast REST API prototyping with Crochet and Scala'>Fast REST API prototyping with Crochet and Scala</a></li>
<li><a href='http://www.xavierllora.net/2008/11/15/meandre-semantic-driven-data-intensive-flows-in-the-clouds/' rel='bookmark' title='Meandre: Semantic-Driven Data-Intensive Flows in the Clouds'>Meandre: Semantic-Driven Data-Intensive Flows in the Clouds</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p><center><br />
<iframe src="http://www.slideshare.net/slideshow/embed_code/4765637" width="425&type=s" height="356" frameborder="0" marginwidth="0" marginheight="0" scrolling="no"></iframe><br/><br/><br />
</center></p>
<p>A lot of water has gone under the bridge since the first release of the <a href="http://seasr.org/meandre/download/">Meandre 1.4.X series</a>. In January I went back to the drawing board and started sketching what was going to be the 1.5.X series. The slide deck embedded above is an extended list of the thoughts during the process. As usual, I started by collecting feedback from people using 1.4.X in production: things that worked, things that needed improvement, and things that were just plain overcomplicated. The hot recurring topics raised by people using 1.4.X can be summarized as:</p>
<ul>
<li>Complex execution concurrency model based on traditional semaphores written in Java (mostly my maintenance nightmare when changes need to be introduced)</li>
<li>Server performance bounded by <a href="http://jena.sourceforge.net/">JENA</a>&#8217;s persistent model implementation</li>
<li>State caching on individual servers to boost performance increases complexity of single-image cluster deployments</li>
<li>Cloud-deployable infrastructure, but not cloud-friendly infrastructure</li>
</ul>
<p>As I mentioned, these elements were the main targets for the 1.5.X series. However, as the redesign moved forward, the new version became a radical departure from the 1.4.X series and eventually turned out to be the 2.0 Alpha version described here. The main changes that forced this transition are:</p>
<ul>
<li>Cloud-friendly infrastructure required rethinking of the core functionalities</li>
<li>Drastic redesign of the back-end state storage</li>
<li>Revisited flow execution engine to support distributed flow execution</li>
<li>Changes on the API that render returned JSON documents incompatible with 1.4.X</li>
</ul>
<p>Meandre 2.0 (already available in the <a href="http://dev-tools.seasr.org/fisheye/browse/Meandre-Infrastructure">SVN trunk</a>) has been rewritten from scratch using <a href="http://www.scala-lang.org/">Scala</a>. That decision was motivated by the Actor model provided by <a href="http://www.scala-lang.org/">Scala</a> (modeled after <a href="http://www.erlang.org/">Erlang</a>&#8217;s actors). Such a model greatly simplifies the mechanics of the infrastructure, and it also provided the basis for Snowfield (the effort to create a scalable distributed flow execution engine for Meandre flows). Also, the expressiveness of the <a href="http://www.scala-lang.org/">Scala</a> language has greatly reduced the size of the code base (the 2.0 code base is roughly 1/3 of the size of the 1.4.X series), greatly simplifying the maintenance the infrastructure will require as we move forward.</p>
<p>The second big change that pulled the 2.0 Alpha trigger was the redesign of the back-end state storage. The 1.4.X series relied heavily on the relational storage for persistent RDF models provided by JENA. For performance reasons, <a href="http://jena.sourceforge.net/">JENA</a> caches the model in memory and mostly assumes ownership of the model. Hence, if you want to provide a single-image Meandre cluster, you need to inject cache coherence mechanics into <a href="http://jena.sourceforge.net/">JENA</a>, greatly increasing the complexity. Also, the relational implementation maps each model into a table and each triple into a row (this is a bit of a simplification). That implies that a large number of SQL statements need to be generated to update models, heavily taxing the relational storage when changes to user repository data need to be introduced.</p>
<p>An ideal cloud-friendly Meandre infrastructure should not maintain state (neither voluntarily nor as a result of the <a href="http://jena.sourceforge.net/">JENA</a> back end). Thus, a fast and scalable back-end storage could allow infrastructure servers to maintain no state and still provide the appearance of a single-image cluster. After testing different alternatives and weighing their community support and development roadmaps, the only option left was <a href="http://www.mongodb.org/">MongoDB</a>. Its setup simplicity for small installations and its ability to easily scale to large installations (including cloud-deployed ones) made <a href="http://www.mongodb.org/">MongoDB</a> the candidate to maintain state for Meandre 2.0. This was quite a departure from the 1.4.X series, where you had the choice to store state via <a href="http://jena.sourceforge.net/">JENA</a> on an embedded <a href="http://db.apache.org/derby/">Derby</a> or an external <a href="http://www.mysql.com/">MySQL</a> server.</p>
<p>A final note on the building blocks that made the 2.0 series possible. Two other side projects were started to support the development of what will become the Meandre 2.0.X series:</p>
<ol>
<li><a href="http://github.com/xllora/Crochet">Crochet</a>: <a href="http://github.com/xllora/Crochet">Crochet</a> aims to help quickly prototype REST APIs, relying on the flexibility of the Scala language. The initial ideas for Crochet were inspired by Gabriele Renzi&#8217;s post on creating a picoframework with Scala (see <a href="http://www.riffraff.info/2009/4/11/step-a-scala-web-picoframework">http://www.riffraff.info/2009/4/11/step-a-scala-web-picoframework</a>) and by the need to quickly prototype APIs for pilot projects. Crochet also provides mechanisms to hide repetitive tasks involved with default responses and authentication/authorization, piggybacking on the mechanics provided by application servers.</li>
<li><a href="http://github.com/xllora/Snare">Snare</a>: <a href="http://github.com/xllora/Snare">Snare</a> is a coordination layer for distributed applications written in Scala that relies on <a href="http://www.mongodb.org/">MongoDB</a> to implement its communication layer. <a href="http://github.com/xllora/Snare">Snare</a> implements a basic heartbeat system and a simple notification mechanism (peer-to-peer and broadcast communication), and uses <a href="http://www.mongodb.org/">MongoDB</a> to track heartbeats and notification mailboxes.</li>
</ol>
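<p>To make the heartbeat and mailbox idea a bit more concrete, here is a minimal in-memory sketch in Java. It is only an illustration under stated assumptions: the class name <code>HeartbeatSketch</code>, its methods, and the timeout parameter are hypothetical and are not Snare&#8217;s API, and plain maps stand in for the MongoDB collections that Snare actually relies on.</p>

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: not Snare's API, and no MongoDB involved.
class HeartbeatSketch {
    // Peer id -> timestamp (in milliseconds) of its last heartbeat.
    private final Map<String, Long> lastSeen = new HashMap<>();
    // Peer id -> pending notifications (the peer's mailbox).
    private final Map<String, List<String>> mailboxes = new HashMap<>();
    // A peer is considered alive if its last beat is at most this old.
    private final long timeoutMillis;

    HeartbeatSketch(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Records a heartbeat for the given peer at the given time.
    void beat(String peer, long nowMillis) {
        lastSeen.put(peer, nowMillis);
    }

    // True if the peer has sent a heartbeat within the timeout window.
    boolean isAlive(String peer, long nowMillis) {
        Long seen = lastSeen.get(peer);
        return seen != null && nowMillis - seen <= timeoutMillis;
    }

    // Peer-to-peer notification: append the message to one mailbox.
    void notifyPeer(String peer, String message) {
        mailboxes.computeIfAbsent(peer, k -> new ArrayList<>()).add(message);
    }

    // Broadcast: deliver the message to every peer that has ever sent a heartbeat.
    void broadcast(String message) {
        for (String peer : lastSeen.keySet()) {
            notifyPeer(peer, message);
        }
    }

    // Returns and clears the pending notifications of a peer.
    List<String> drain(String peer) {
        List<String> messages = mailboxes.remove(peer);
        return messages == null ? new ArrayList<String>() : messages;
    }

    public static void main(String[] args) {
        HeartbeatSketch sketch = new HeartbeatSketch(1000L);
        sketch.beat("node-a", 0L);
        sketch.beat("node-b", 500L);
        sketch.broadcast("flow started");
        System.out.println(sketch.isAlive("node-a", 1500L));
        System.out.println(sketch.drain("node-b"));
    }
}
```

<p>Keeping both structures in a shared store instead of in process memory is what would let any server inspect heartbeats and mailboxes without holding state of its own.</p>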
<p>Related posts:<ol>
<li><a href='http://www.xavierllora.net/2009/12/01/meandre-is-going-scala/' rel='bookmark' title='Meandre is going Scala'>Meandre is going Scala</a></li>
<li><a href='http://www.xavierllora.net/2010/01/21/fast-rest-api-prototyping-with-crochet-and-scala/' rel='bookmark' title='Fast REST API prototyping with Crochet and Scala'>Fast REST API prototyping with Crochet and Scala</a></li>
<li><a href='http://www.xavierllora.net/2008/11/15/meandre-semantic-driven-data-intensive-flows-in-the-clouds/' rel='bookmark' title='Meandre: Semantic-Driven Data-Intensive Flows in the Clouds'>Meandre: Semantic-Driven Data-Intensive Flows in the Clouds</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.xavierllora.net/2010/07/15/meandre-2-0-alpha-preview-scala-mongodb/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Scaling eCGA Model Building via Data-Intensive Computing</title>
		<link>http://www.xavierllora.net/2010/04/08/scaling-ecga-model-building-via-data-intensive-computing/</link>
		<comments>http://www.xavierllora.net/2010/04/08/scaling-ecga-model-building-via-data-intensive-computing/#comments</comments>
		<pubDate>Thu, 08 Apr 2010 16:17:39 +0000</pubDate>
		<dc:creator>Xavier</dc:creator>
				<category><![CDATA[Data-Intensive Computing]]></category>
		<category><![CDATA[Estimation of Distribution Algorithms]]></category>
		<category><![CDATA[Research]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[eCGA]]></category>
		<category><![CDATA[hadoop]]></category>
		<category><![CDATA[map-reduce]]></category>
		<category><![CDATA[mongodb]]></category>
		<category><![CDATA[pro]]></category>

		<guid isPermaLink="false">http://www.xavierllora.net/?p=664</guid>
		<description><![CDATA[I just uploaded the technical report of the paper we put together for CEC 2010 on how we can scale up eCGA using a MapReduce approach. Besides exploring the Hadoop implementation, the paper also presents some very compelling results obtained with MongoDB (a document-based store able to perform parallel MapReduce tasks via sharding). [...]
Related posts:<ol>
<li><a href='http://www.xavierllora.net/2009/10/09/scaling-genetic-algorithms-using-mapreduce/' rel='bookmark' title='Scaling Genetic Algorithms using MapReduce'>Scaling Genetic Algorithms using MapReduce</a></li>
<li><a href='http://www.xavierllora.net/2009/07/13/data-intensive-computing-for-competent-genetic-algorithms-a-pilot-study-using-meandre-2/' rel='bookmark' title='Data-Intensive Computing for  Competent Genetic Algorithms:  A Pilot Study using Meandre'>Data-Intensive Computing for  Competent Genetic Algorithms:  A Pilot Study using Meandre</a></li>
<li><a href='http://www.xavierllora.net/2008/03/26/data-intensive-scalable-computing-randy-bryant/' rel='bookmark' title='[BDCSG2008] Data-Intensive Scalable Computing (Randy Bryant)'>[BDCSG2008] Data-Intensive Scalable Computing (Randy Bryant)</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>I just uploaded the technical report of the paper we put together for <a href="http://www.wcci2010.org/">CEC 2010</a> on how we can scale up eCGA using a MapReduce approach. Besides exploring the <a href="http://hadoop.apache.org/">Hadoop</a> implementation, the paper also presents some very compelling results obtained with <a href="http://www.mongodb.org/display/DOCS/Home">MongoDB</a> (a document-based store able to perform parallel MapReduce tasks via sharding). The paper is available as <a href="http://www.illigal.uiuc.edu/pub/papers/IlliGALs/2010001.pdf">PDF</a> and <a href="http://www.illigal.uiuc.edu/pub/papers/IlliGALs/2010001.ps.Z">PS</a>.</p>
<p><strong>Abstract:</strong><br />
This paper shows how the extended compact genetic algorithm can be scaled using data-intensive computing techniques such as MapReduce. Two different frameworks (Hadoop and MongoDB) are used to deploy MapReduce implementations of the compact and extended compact genetic algorithms. Results show that both are good choices to deal with large-scale problems as they can scale with the number of commodity machines, as opposed to previous efforts with other techniques that either required specialized high-performance hardware or shared memory environments.</p>
<p>Related posts:<ol>
<li><a href='http://www.xavierllora.net/2009/10/09/scaling-genetic-algorithms-using-mapreduce/' rel='bookmark' title='Scaling Genetic Algorithms using MapReduce'>Scaling Genetic Algorithms using MapReduce</a></li>
<li><a href='http://www.xavierllora.net/2009/07/13/data-intensive-computing-for-competent-genetic-algorithms-a-pilot-study-using-meandre-2/' rel='bookmark' title='Data-Intensive Computing for  Competent Genetic Algorithms:  A Pilot Study using Meandre'>Data-Intensive Computing for  Competent Genetic Algorithms:  A Pilot Study using Meandre</a></li>
<li><a href='http://www.xavierllora.net/2008/03/26/data-intensive-scalable-computing-randy-bryant/' rel='bookmark' title='[BDCSG2008] Data-Intensive Scalable Computing (Randy Bryant)'>[BDCSG2008] Data-Intensive Scalable Computing (Randy Bryant)</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.xavierllora.net/2010/04/08/scaling-ecga-model-building-via-data-intensive-computing/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Soaring the Clouds with Meandre</title>
		<link>http://www.xavierllora.net/2010/03/15/soaring-the-clouds-with-meandre/</link>
		<comments>http://www.xavierllora.net/2010/03/15/soaring-the-clouds-with-meandre/#comments</comments>
		<pubDate>Mon, 15 Mar 2010 22:55:11 +0000</pubDate>
		<dc:creator>Xavier</dc:creator>
				<category><![CDATA[Data-Intensive Computing]]></category>
		<category><![CDATA[Notes]]></category>
		<category><![CDATA[Research]]></category>
		<category><![CDATA[cloud computing]]></category>
		<category><![CDATA[hadoop]]></category>
		<category><![CDATA[meandre]]></category>
		<category><![CDATA[ZigZag]]></category>

		<guid isPermaLink="false">http://www.xavierllora.net/?p=659</guid>
		<description><![CDATA[You may find the slide deck and the abstract for the presentation we delivered today at the &#8220;Data-Intensive Research: how should we improve our ability to use data&#8221; workshop in Edinburgh. Abstract This talk will focus on a highly scalable data-intensive infrastructure being developed at the National Center for Supercomputing Applications (NCSA) at the University [...]
Related posts:<ol>
<li><a href='http://www.xavierllora.net/2008/11/15/meandre-semantic-driven-data-intensive-flows-in-the-clouds/' rel='bookmark' title='Meandre: Semantic-Driven Data-Intensive Flows in the Clouds'>Meandre: Semantic-Driven Data-Intensive Flows in the Clouds</a></li>
<li><a href='http://www.xavierllora.net/2008/04/18/meandre-semantic-driven-data-intensive-flow-engine/' rel='bookmark' title='Meandre: Semantic-Driven Data-Intensive Flow Engine'>Meandre: Semantic-Driven Data-Intensive Flow Engine</a></li>
<li><a href='http://www.xavierllora.net/2009/01/29/data-intensive-computing-for-competent-genetic-algorithms-a-pilot-study-using-meandre/' rel='bookmark' title='Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study using  Meandre'>Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study using  Meandre</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>You may find the slide deck and the abstract for the presentation we delivered today at the <a href="http://wikis.nesc.ac.uk/escienvoy/Data-Intensive_Research:_how_should_we_improve_our_ability_to_use_data">&#8220;Data-Intensive Research: how should we improve our ability to use data&#8221;</a> workshop in Edinburgh.</p>
<p><center><iframe src="http://www.slideshare.net/slideshow/embed_code/3440242" width="425&type=s" height="356" frameborder="0" marginwidth="0" marginheight="0" scrolling="no"></iframe><br/><br/></center></p>
<p><strong>Abstract</strong></p>
<p>This talk will focus on a highly scalable data-intensive infrastructure being developed at the National Center for Supercomputing Applications (NCSA) at the University of Illinois and will introduce current research efforts to tackle the challenges presented by big data. Research efforts include exploring potential ways of integrating cloud computing concepts, such as Hadoop or Meandre, with traditional HPC technologies and assets. These architecture models contrast significantly, but can be leveraged by building cloud conduits that connect these resources to provide even greater flexibility and scalability on demand. Orchestrating the physical computational environment requires innovative and sophisticated software infrastructure that can transparently take advantage of the functional features and negotiate the constraints imposed by this diversity of computational resources. Research conducted during the development of the Meandre infrastructure has led to the production of an agile conductor able to leverage the particular advantages in the physical diversity. It can also be implemented as services and/or in the context of another application benefiting from its reusability, flexibility, and high scalability. Some example applications and an introduction to the data-intensive infrastructure architecture will be presented to provide an overview of the diverse scope of Meandre usages. Finally, a case will be presented showing how software developers and system designers can easily transition to these new paradigms to address the primary data-deluge challenges and to soar to new heights with extreme application scalability using cloud computing concepts.</p>
<p>Related posts:<ol>
<li><a href='http://www.xavierllora.net/2008/11/15/meandre-semantic-driven-data-intensive-flows-in-the-clouds/' rel='bookmark' title='Meandre: Semantic-Driven Data-Intensive Flows in the Clouds'>Meandre: Semantic-Driven Data-Intensive Flows in the Clouds</a></li>
<li><a href='http://www.xavierllora.net/2008/04/18/meandre-semantic-driven-data-intensive-flow-engine/' rel='bookmark' title='Meandre: Semantic-Driven Data-Intensive Flow Engine'>Meandre: Semantic-Driven Data-Intensive Flow Engine</a></li>
<li><a href='http://www.xavierllora.net/2009/01/29/data-intensive-computing-for-competent-genetic-algorithms-a-pilot-study-using-meandre/' rel='bookmark' title='Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study using  Meandre'>Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study using  Meandre</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.xavierllora.net/2010/03/15/soaring-the-clouds-with-meandre/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Temporary storage for Meandre&#8217;s distributed flow execution</title>
		<link>http://www.xavierllora.net/2009/09/29/temporary-storage-for-meandres-distribute-flow-execution/</link>
		<comments>http://www.xavierllora.net/2009/09/29/temporary-storage-for-meandres-distribute-flow-execution/#comments</comments>
		<pubDate>Tue, 29 Sep 2009 15:14:28 +0000</pubDate>
		<dc:creator>Xavier</dc:creator>
				<category><![CDATA[Notes]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[Data-Intensive Computing]]></category>
		<category><![CDATA[data-intensive flows]]></category>
		<category><![CDATA[java]]></category>
		<category><![CDATA[meandre]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[storage]]></category>
		<category><![CDATA[tokyo cabinet]]></category>

		<guid isPermaLink="false">http://www.xavierllora.net/?p=615</guid>
		<description><![CDATA[Designing the distributed execution of a generic Meandre flow involves several moving pieces. One of those is the temporary storage required by the computing nodes (think of one node as one isolated component of a flow) to keep up with the data generated by a component, and also be able to replicate such [...]
Related posts:<ol>
<li><a href='http://www.xavierllora.net/2009/08/13/easy-reliable-and-flexible-storage-for-python/' rel='bookmark' title='Easy, reliable, and flexible storage for Python'>Easy, reliable, and flexible storage for Python</a></li>
<li><a href='http://www.xavierllora.net/2008/07/01/efficient-storage-for-python/' rel='bookmark' title='Efficient storage for Python'>Efficient storage for Python</a></li>
<li><a href='http://www.xavierllora.net/2008/06/05/the-next-generation-of-data-bases/' rel='bookmark' title='The next generation of data bases'>The next generation of data bases</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>Designing the distributed execution of a generic Meandre flow involves several moving pieces. One of those is the temporary storage required by the computing nodes (think of one node as one isolated component of a flow) to keep up with the data generated by a component, and also to be able to replicate such storage to the node containing the consumer to be fed. Such storage, local to each node, must guarantee at least three basic properties:</p>
<ul>
<li>Transaction ready</li>
<li>Lightweight implementation</li>
<li>Efficient write and read to minimize the contention on ports</li>
</ul>
<p>Also, it is important to keep in mind that in a distributed execution scenario, each node needs its own separate, standalone storage system. Thus, it is also important to minimize the installation and maintenance overhead of such a storage subsystem. There are several alternatives available, ranging from traditional relational data base systems to home-brewed solutions. Relational data base systems provide a distributed, reliable, stable, and well-tested environment, but they tend to require quite an involved installation and maintenance. Also, tuning those systems to optimize performance may require quite involved monitoring and tweaking. On the other hand, home-brewed solutions can be optimized for performance by dropping unneeded functionality and focusing on write and read performance. However, such solutions tend to be bug-prone and time-consuming, not to mention that proving transaction correctness can be quite involved.</p>
<p>Fortunately, there is a middle ground where efficient and stable transaction-aware solutions are available. They may not provide SQL interfaces, but they still provide transaction boundaries. Also, since they are oriented to maximize performance, they can provide better throughput and operation latency than having to traverse the SQL stack. Examples of such storage systems can be found under the areas of key-value stores and column stores. Several options were considered while writing these lines, but key-value stores were the ones that best matched the three requirements described above. Several options were informally tested, including solutions like HDF and Berkeley DB; however, the best performer by far, under stress-test conditions similar to those expected of the sketched temporary storage subsystem, was <a href="http://1978th.net/tokyocabinet/">Tokyo Cabinet</a>. I already <a href="/2008/06/05/the-next-generation-of-data-bases/">introduced</a> and <a href="/2009/08/13/easy-reliable-and-flexible-storage-for-python/">tested</a> <a href="http://1978th.net/tokyocabinet/">Tokyo Cabinet</a> more than a year ago, but this time I was going to give it a stress test to convince myself that it was what I wanted to use as temporary storage for the distributed flow execution.</p>
<h2>The experiment</h2>
<p>Tokyo Cabinet is a collection of storage utilities including, among other facilities, key-value stores implemented as hash files or B-trees and flexible column stores. To illustrate the performance and throughput you can achieve, consider implementing multiple queues on a single casket (the <a href="http://1978th.net/tokyocabinet/">Tokyo Cabinet</a> file containing the data store): B-trees with duplicated keys can help achieve that goal. The duplicated keys are the queue names, and the values are the <a href="http://en.wikipedia.org/wiki/Universally_Unique_Identifier">UUID</a>s of the objects being stored. Objects are also stored in the same B-tree by using the <a href="http://en.wikipedia.org/wiki/Universally_Unique_Identifier">UUID</a> as a key, and the value becomes the payload to store (usually an array of bytes).</p>
<p>Previously, I have been heavily using the Python bindings to test <a href="http://1978th.net/tokyocabinet/">Tokyo Cabinet</a>, but this time I went down the Java route (since the Meandre infrastructure is written in Java). The Java bindings are basically built around JNI and statically link to the C version of the <a href="http://1978th.net/tokyocabinet/">Tokyo Cabinet</a> library, giving you the best of both worlds. To measure how fast I can write data out of a port into the local storage in a transactional mode, I used the following piece of code.</p>
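<p>The queue layout described above can be modeled in memory with a few lines of Java. This is only a sketch under stated assumptions: <code>QueueCasketSketch</code> and its methods are hypothetical names, it does not use the Tokyo Cabinet API, and two maps stand in for the single B-tree (one for the duplicated queue-name keys, one for the UUID-keyed payloads).</p>

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.UUID;

// In-memory stand-in for a B-tree casket that allows duplicated keys.
// Hypothetical names throughout; this is not the Tokyo Cabinet API.
class QueueCasketSketch {
    // Queue name -> UUIDs in insertion order (models the duplicated keys).
    private final TreeMap<String, Deque<String>> queues = new TreeMap<>();
    // UUID -> payload (models the object records kept in the same tree).
    private final Map<String, byte[]> objects = new HashMap<>();

    // Appends a payload to the named queue and returns the UUID assigned to it.
    String enqueue(String queue, byte[] payload) {
        String id = UUID.randomUUID().toString();
        objects.put(id, payload);
        queues.computeIfAbsent(queue, k -> new ArrayDeque<>()).addLast(id);
        return id;
    }

    // Removes and returns the oldest payload of the named queue, or null if it is empty.
    byte[] dequeue(String queue) {
        Deque<String> ids = queues.get(queue);
        if (ids == null || ids.isEmpty()) {
            return null;
        }
        String id = ids.pollFirst();
        return objects.remove(id);
    }

    // Number of objects currently waiting in the named queue.
    int size(String queue) {
        Deque<String> ids = queues.get(queue);
        return ids == null ? 0 : ids.size();
    }

    public static void main(String[] args) {
        QueueCasketSketch casket = new QueueCasketSketch();
        casket.enqueue("port-a", new byte[] {1});
        casket.enqueue("port-a", new byte[] {2});
        casket.enqueue("port-b", new byte[] {9});
        byte[] oldest = casket.dequeue("port-a"); // first payload written to port-a
        System.out.println(oldest[0]);
        System.out.println(casket.size("port-a"));
    }
}
```

<p>In a real casket both kinds of records would share one on-disk B-tree, and each enqueue or dequeue would be wrapped in a transaction so that the queue entry and its payload are written or removed together.</p>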

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">	<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">static</span> <span style="color: #000066; font-weight: bold;">void</span> main <span style="color: #009900;">&#40;</span> <span style="color: #003399;">String</span> args <span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> <span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #000066; font-weight: bold;">int</span> MAX <span style="color: #339933;">=</span> <span style="color: #cc66cc;">10000000</span><span style="color: #339933;">;</span>
		<span style="color: #000066; font-weight: bold;">int</span> inc <span style="color: #339933;">=</span> <span style="color: #cc66cc;">10</span><span style="color: #339933;">;</span>
		<span style="color: #000066; font-weight: bold;">int</span> cnt <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span>
		<span style="color: #000066; font-weight: bold;">float</span> fa <span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #000066; font-weight: bold;">float</span><span style="color: #009900;">&#91;</span><span style="color: #cc66cc;">8</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
		<span style="color: #000066; font-weight: bold;">int</span> reps <span style="color: #339933;">=</span> <span style="color: #cc66cc;">10</span><span style="color: #339933;">;</span>
&nbsp;
		<span style="color: #000000; font-weight: bold;">for</span> <span style="color: #009900;">&#40;</span> <span style="color: #000066; font-weight: bold;">int</span> i<span style="color: #339933;">=</span><span style="color: #cc66cc;">1</span> <span style="color: #339933;">;</span> i<span style="color: #339933;">&lt;=</span>MAX <span style="color: #339933;">;</span> i<span style="color: #339933;">*=</span>inc  <span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
			<span style="color: #666666; font-style: italic;">//System.out.println(&quot;Size: &quot;+i);</span>
			<span style="color: #000000; font-weight: bold;">for</span> <span style="color: #009900;">&#40;</span> <span style="color: #000066; font-weight: bold;">int</span> j<span style="color: #339933;">=</span><span style="color: #cc66cc;">0</span> <span style="color: #339933;">;</span> j<span style="color: #339933;">&lt;</span>reps <span style="color: #339933;">;</span> j<span style="color: #339933;">++</span> <span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>	
				<span style="color: #666666; font-style: italic;">//System.out.println(&quot;\tRepetition: &quot;+j);</span>
&nbsp;
				<span style="color: #666666; font-style: italic;">// open the database</span>
				BDB bdb <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> BDB<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
				<span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">&#40;</span><span style="color: #339933;">!</span>bdb.<span style="color: #006633;">open</span><span style="color: #009900;">&#40;</span>TEST_CASKET_TCB, BDB.<span style="color: #006633;">OWRITER</span> <span style="color: #339933;">|</span> BDB.<span style="color: #006633;">OCREAT</span> <span style="color: #339933;">|</span> BDB.<span style="color: #006633;">OTSYNC</span> <span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span>
					<span style="color: #000066; font-weight: bold;">int</span> ecode <span style="color: #339933;">=</span> bdb.<span style="color: #006633;">ecode</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
					fail<span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;open error: &quot;</span> <span style="color: #339933;">+</span> bdb.<span style="color: #006633;">errmsg</span><span style="color: #009900;">&#40;</span>ecode<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
				<span style="color: #009900;">&#125;</span>
&nbsp;
				<span style="color: #666666; font-style: italic;">// Add a bunch of duplicates</span>
				<span style="color: #000066; font-weight: bold;">long</span> start <span style="color: #339933;">=</span> <span style="color: #003399;">System</span>.<span style="color: #006633;">currentTimeMillis</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
				bdb.<span style="color: #006633;">tranbegin</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
				<span style="color: #000000; font-weight: bold;">for</span> <span style="color: #009900;">&#40;</span> <span style="color: #000066; font-weight: bold;">int</span> k<span style="color: #339933;">=</span><span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span> k<span style="color: #339933;">&lt;</span>i<span style="color: #339933;">;</span> k<span style="color: #339933;">++</span> <span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
					<span style="color: #003399;">String</span> uuid <span style="color: #339933;">=</span> UUID.<span style="color: #006633;">randomUUID</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">toString</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
					bdb.<span style="color: #006633;">putdup</span><span style="color: #009900;">&#40;</span>QUEUE_KEY, uuid<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
					bdb.<span style="color: #006633;">putdup</span><span style="color: #009900;">&#40;</span>uuid.<span style="color: #006633;">getBytes</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>, uuid.<span style="color: #006633;">getBytes</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>	
				<span style="color: #009900;">&#125;</span>
				bdb.<span style="color: #006633;">trancommit</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
				fa<span style="color: #009900;">&#91;</span>cnt<span style="color: #009900;">&#93;</span> <span style="color: #339933;">+=</span> <span style="color: #003399;">System</span>.<span style="color: #006633;">currentTimeMillis</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">-</span>start<span style="color: #339933;">;</span>
&nbsp;
				<span style="color: #666666; font-style: italic;">// Clean up</span>
				bdb.<span style="color: #006633;">close</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
				<span style="color: #000000; font-weight: bold;">new</span> <span style="color: #003399;">File</span><span style="color: #009900;">&#40;</span>TEST_CASKET_TCB<span style="color: #009900;">&#41;</span>.<span style="color: #006633;">delete</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
			<span style="color: #009900;">&#125;</span>
			fa<span style="color: #009900;">&#91;</span>cnt<span style="color: #009900;">&#93;</span> <span style="color: #339933;">/=</span> reps<span style="color: #339933;">;</span>
			<span style="color: #003399;">System</span>.<span style="color: #006633;">out</span>.<span style="color: #006633;">println</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;&quot;</span><span style="color: #339933;">+</span>i<span style="color: #339933;">+</span><span style="color: #0000ff;">&quot;<span style="color: #000099; font-weight: bold;">\t</span>&quot;</span><span style="color: #339933;">+</span>fa<span style="color: #009900;">&#91;</span>cnt<span style="color: #009900;">&#93;</span><span style="color: #339933;">+</span><span style="color: #0000ff;">&quot;<span style="color: #000099; font-weight: bold;">\t</span>&quot;</span><span style="color: #339933;">+</span><span style="color: #009900;">&#40;</span>fa<span style="color: #009900;">&#91;</span>cnt<span style="color: #009900;">&#93;</span><span style="color: #339933;">/</span>i<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
			cnt<span style="color: #339933;">++;</span>
		<span style="color: #009900;">&#125;</span>
	<span style="color: #009900;">&#125;</span></pre></div></div>

<p>The idea is very simple. Just start storing 1, 10, 100, 1000, 10000, 100000, 1000000, and 10000000 pieces of data at once in a transaction. Measure the time. For each transaction size, repeat the operation 10 times and average the time, to mitigate the fact that the experiment was run on a laptop running all sorts of other concurrent applications. Plot the results to illustrate:</p>
<ol>
<li>time required to insert one piece of data as a function of the number of pieces of data involved in the transaction</li>
<li>number of pieces of data written per second as a function of the number of pieces of data involved in the transaction</li>
</ol>
<p>The idea is to expose the behavior of <a href="http://1978th.net/tokyocabinet/">Tokyo Cabinet</a> as more data is involved in a transaction, to check whether performance degrades as the volume increases. This is an important issue, since data-intensive flows can generate large volumes of data per firing event.</p>
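<p>Both plotted quantities follow directly from the averaged transaction time. As a small sketch (the class and method names are mine, and the numbers in <code>main</code> are made up for illustration, not measurements from this experiment):</p>

```java
// Derives the two plotted metrics from an averaged transaction time.
final class ThroughputSketch {

    // Average milliseconds spent per item in a transaction of the given size.
    static double millisPerItem(double avgMillis, long items) {
        return avgMillis / items;
    }

    // Items written per second for a transaction of the given size.
    static double itemsPerSecond(double avgMillis, long items) {
        return items / (avgMillis / 1000.0);
    }

    public static void main(String[] args) {
        // Hypothetical input: 10000 items committed in 500 ms on average.
        double avgMillis = 500.0;
        long items = 10_000;
        System.out.println(millisPerItem(avgMillis, items) + " ms/item");
        System.out.println(itemsPerSecond(avgMillis, items) + " items/s");
    }
}
```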
<h2>The results</h2>
<p>Results are displayed on the figures below.</p>
<p><a href="http://www.xavierllora.net/wp-content/uploads/2009/09/tc_time.png"><img src="http://www.xavierllora.net/wp-content/uploads/2009/09/tc_time-400x400.png" alt="Time per data unit as a function of number of data involve in a transaction" title="Time per data unit as a function of number of data involve in a transaction" width="300" height="300" /></a><a href="http://www.xavierllora.net/wp-content/uploads/2009/09/tc_throughput.png"><img src="http://www.xavierllora.net/wp-content/uploads/2009/09/tc_throughput-400x400.png" alt="Throughput as a function of number of data in a transaction" title="Throughput as a function of number of data in a transaction" width="300" height="300" /></a></p>
<p>The first important element to highlight is that the time to insert one data element does not degrade as the volume increases. Actually, it is quite interesting that <a href="http://1978th.net/tokyocabinet/">Tokyo Cabinet</a> feels more comfortable as the volume per transaction grows. The throughput results are also interesting, since they show that it is able to sustain transfers of around 40K data units per second, and that the only bottleneck is the disk cache management and the bandwidth to the disk itself, which gets saturated after pushing more than 10K pieces of data.</p>
<h2>The lessons learned</h2>
<p><a href="http://1978th.net/tokyocabinet/">Tokyo Cabinet</a> is an excellent candidate to support the temporary transactional storage required in a distributed execution of a Meandre flow. Other alternatives like <a href="http://www.mysql.com/">MySQL</a>, embedded <a href="http://db.apache.org/derby/">Apache Derby</a>, the <a href="http://www.oracle.com/database/berkeley-db/je/index.html">Java edition of Berkeley DB</a>, and <a href="http://www.zentus.com/sqlitejdbc/">SQLite JDBC</a> could not even get close to such performance, falling at least one order of magnitude behind.</p>
<p>Related posts:<ol>
<li><a href='http://www.xavierllora.net/2009/08/13/easy-reliable-and-flexible-storage-for-python/' rel='bookmark' title='Easy, reliable, and flexible storage for Python'>Easy, reliable, and flexible storage for Python</a></li>
<li><a href='http://www.xavierllora.net/2008/07/01/efficient-storage-for-python/' rel='bookmark' title='Efficient storage for Python'>Efficient storage for Python</a></li>
<li><a href='http://www.xavierllora.net/2008/06/05/the-next-generation-of-data-bases/' rel='bookmark' title='The next generation of data bases'>The next generation of data bases</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.xavierllora.net/2009/09/29/temporary-storage-for-meandres-distribute-flow-execution/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Liquid: RDF endpoint for FluidDB</title>
		<link>http://www.xavierllora.net/2009/09/24/liquid-rdf-endpoint-for-fluiddb/</link>
		<comments>http://www.xavierllora.net/2009/09/24/liquid-rdf-endpoint-for-fluiddb/#comments</comments>
		<pubDate>Thu, 24 Sep 2009 20:45:19 +0000</pubDate>
		<dc:creator>Xavier</dc:creator>
				<category><![CDATA[Notes]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[cloud computing]]></category>
		<category><![CDATA[Data-Intensive Computing]]></category>
		<category><![CDATA[FluidDB]]></category>
		<category><![CDATA[meandre]]></category>
		<category><![CDATA[RDF]]></category>
		<category><![CDATA[sparql]]></category>
		<category><![CDATA[storage]]></category>

		<guid isPermaLink="false">http://www.xavierllora.net/?p=593</guid>
		<description><![CDATA[A while ago I wrote some thoughts about how to map RDF to and from FluidDB. There I explored how you could map RDF onto FluidDB, and how to get it back. That got me thinking about how to get a simple endpoint you could query for RDF. Imagine that you could pull FluidDB data [...]
Related posts:<ol>
<li><a href='http://www.xavierllora.net/2009/08/25/liquid-rdf-meandering-in-fluiddb/' rel='bookmark' title='Liquid: RDF meandering in FluidDB'>Liquid: RDF meandering in FluidDB</a></li>
<li><a href='http://www.xavierllora.net/2009/09/29/temporary-storage-for-meandres-distribute-flow-execution/' rel='bookmark' title='Temporary storage for Meandre&#8217;s distributed flow execution'>Temporary storage for Meandre&#8217;s distributed flow execution</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>A while ago I wrote some thoughts about how to map RDF to and from <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a>. There I explored how you could map RDF onto FluidDB, and how to get it back. That got me thinking about how to get a simple endpoint you could query for RDF. Imagine that you could pull <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> data as RDF; then you could get all the flexibility of <a href="http://www.w3.org/TR/rdf-sparql-query/">SPARQL</a> for free. With this idea in mind, I grabbed <a href="http://seasr.org/meandre">Meandre</a> and the <a href="http://github.com/rossjones/JFluidDB">JFluidDB</a> library started by <a href="http://github.com/rossjones">Ross Jones</a>, and built a few components.</p>
<p>The main goal was to be able to get an object, list its tags, and express the result in RDF. <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> helps the mapping since objects are uniquely identified by URIs. For instance, the unique object <code>5ff74371-455b-4299-83f9-ba13ae898ad1</code> (FluidDB relies on UUID version four with the form <code>xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx</code>) is uniquely identified by <code>http://sandbox.fluidinfo.com/objects/5ff74371-455b-4299-83f9-ba13ae898ad1</code> (or a URL of the form <code>http://sandbox.fluidinfo.com/objects/xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx</code>) if you are using the sandbox, or by <code>http://fluiddb.fluidinfo.com/objects/5ff74371-455b-4299-83f9-ba13ae898ad1</code> if you are using the main instance. Same story for tags. The tag <code>fluiddb/about</code> can be uniquely identified by the URI <code>http://sandbox.fluidinfo.com/tags/fluiddb/about</code>, or <code>http://fluiddb.fluidinfo.com/tags/fluiddb/about</code>.</p>
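<p>The URI scheme above is simple enough to sketch in a few lines of Java. This is only an illustration: the class and method names are hypothetical and not part of JFluidDB, and the check assumes the standard version-four UUID shape, where <code>y</code> is one of <code>8</code>, <code>9</code>, <code>a</code>, or <code>b</code>.</p>

```java
import java.util.regex.Pattern;

// Builds object and tag URIs for a FluidDB instance (illustrative helper only).
final class FluidUriSketch {

    static final String SANDBOX = "http://sandbox.fluidinfo.com";
    static final String MAIN = "http://fluiddb.fluidinfo.com";

    // Version-four UUID shape: xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx, y in [89ab].
    private static final Pattern UUID_V4 = Pattern.compile(
            "[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}");

    // URI of an object: base + /objects/ + uuid, rejecting ids that are not UUID v4.
    static String objectUri(String base, String uuid) {
        if (!UUID_V4.matcher(uuid).matches()) {
            throw new IllegalArgumentException("Not a version-four UUID: " + uuid);
        }
        return base + "/objects/" + uuid;
    }

    // URI of a tag: base + /tags/ + the tag path, for example fluiddb/about.
    static String tagUri(String base, String tagPath) {
        return base + "/tags/" + tagPath;
    }

    public static void main(String[] args) {
        System.out.println(objectUri(SANDBOX, "5ff74371-455b-4299-83f9-ba13ae898ad1"));
        System.out.println(tagUri(MAIN, "fluiddb/about"));
    }
}
```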
<h2>A simple RDF description for an object</h2>
<p>Once you get the object back, the basic translated RDF version for object <code>a10ab0f3-ef56-4fc0-a8fa-4d452d8ab1db</code> should look like the listing below, in Turtle notation.</p>

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;"><span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;http:</span>//sandbox.fluidinfo.com/objects/a10ab0f3-ef56-4fc0-a8fa-4d452d8ab1db<span style="color: #000000; font-weight: bold;">&gt;</span></span>
      <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;http:</span>//www.w3.org/1999/02/22-rdf-syntax-ns#type<span style="color: #000000; font-weight: bold;">&gt;</span></span>
              <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;http:</span>//sandbox.fluidinfo.com/objects<span style="color: #000000; font-weight: bold;">/&gt;</span></span> , <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;http:</span>//www.w3.org/1999/02/22-rdf-syntax-ns#Bag<span style="color: #000000; font-weight: bold;">&gt;</span></span> ;
      <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;http:</span>//www.w3.org/1999/02/22-rdf-syntax-ns#_1<span style="color: #000000; font-weight: bold;">&gt;</span></span>
              <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;http:</span>//sandbox.fluidinfo.com/tags/fluiddb/about<span style="color: #000000; font-weight: bold;">&gt;</span></span> ;
      <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;http:</span>//www.w3.org/1999/02/22-rdf-syntax-ns#_2<span style="color: #000000; font-weight: bold;">&gt;</span></span>
              <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;http:</span>//sandbox.fluidinfo.com/tags/fluiddb/tags/path<span style="color: #000000; font-weight: bold;">&gt;</span></span> ;
      <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;http:</span>//www.w3.org/1999/02/22-rdf-syntax-ns#_3<span style="color: #000000; font-weight: bold;">&gt;</span></span>
              <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;http:</span>//sandbox.fluidinfo.com/tags/fluiddb/tags/description<span style="color: #000000; font-weight: bold;">&gt;</span></span> ;
      <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;http:</span>//purl.org/dc/elements/1.1/description<span style="color: #000000; font-weight: bold;">&gt;</span></span>
              &quot;Object for the attribute fluiddb/default/tags/permission/update/policy&quot;^^<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;http:</span>//www.w3.org/2001/XMLSchema#string<span style="color: #000000; font-weight: bold;">&gt;</span></span> .</pre></div></div>

<p>I will break the above example into the three main pieces involved (the id, the about, and the tags) and explain each one. The basic construct is simple. First, a triple marks the object as a <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> object.</p>

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;"><span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;http:</span>//sandbox.fluidinfo.com/objects/a10ab0f3-ef56-4fc0-a8fa-4d452d8ab1db<span style="color: #000000; font-weight: bold;">&gt;</span></span>
      <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;http:</span>//www.w3.org/1999/02/22-rdf-syntax-ns#type<span style="color: #000000; font-weight: bold;">&gt;</span></span>
              <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;http:</span>//sandbox.fluidinfo.com/objects<span style="color: #000000; font-weight: bold;">/&gt;</span></span>   
.</pre></div></div>

<p>Then, if the object had an <code>about</code> associated on creation, another triple gets generated and added, as shown below. To be consistent, I suggest reusing DC description, since that is what the <code>about</code> for an object tends to indicate.</p>

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;"><span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;http:</span>//sandbox.fluidinfo.com/objects/a10ab0f3-ef56-4fc0-a8fa-4d452d8ab1db<span style="color: #000000; font-weight: bold;">&gt;</span></span>
      <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;http:</span>//purl.org/dc/elements/1.1/description<span style="color: #000000; font-weight: bold;">&gt;</span></span>
              &quot;Object for the attribute fluiddb/default/tags/permission/update/policy&quot;^^<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;http:</span>//www.w3.org/2001/XMLSchema#string<span style="color: #000000; font-weight: bold;">&gt;</span></span> 
.</pre></div></div>

<p>Finally, if there are tags associated with the object, a bag gets created, and all the URIs describing the tags get pushed into the bag, as shown below.</p>

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;"><span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;http:</span>//sandbox.fluidinfo.com/objects/a10ab0f3-ef56-4fc0-a8fa-4d452d8ab1db<span style="color: #000000; font-weight: bold;">&gt;</span></span>
      <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;http:</span>//www.w3.org/1999/02/22-rdf-syntax-ns#type<span style="color: #000000; font-weight: bold;">&gt;</span></span>
              <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;http:</span>//www.w3.org/1999/02/22-rdf-syntax-ns#Bag<span style="color: #000000; font-weight: bold;">&gt;</span></span> ;
      <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;http:</span>//www.w3.org/1999/02/22-rdf-syntax-ns#_1<span style="color: #000000; font-weight: bold;">&gt;</span></span>
              <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;http:</span>//sandbox.fluidinfo.com/tags/fluiddb/about<span style="color: #000000; font-weight: bold;">&gt;</span></span> ;
      <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;http:</span>//www.w3.org/1999/02/22-rdf-syntax-ns#_2<span style="color: #000000; font-weight: bold;">&gt;</span></span>
              <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;http:</span>//sandbox.fluidinfo.com/tags/fluiddb/tags/path<span style="color: #000000; font-weight: bold;">&gt;</span></span> ;
      <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;http:</span>//www.w3.org/1999/02/22-rdf-syntax-ns#_3<span style="color: #000000; font-weight: bold;">&gt;</span></span>
              <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;http:</span>//sandbox.fluidinfo.com/tags/fluiddb/tags/description<span style="color: #000000; font-weight: bold;">&gt;</span></span>
.</pre></div></div>
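<p>Putting the three pieces together, the translation can be sketched as plain string building. This is my own illustration of the mapping, not the code of the Meandre component: the class name and method signature are hypothetical, and a real implementation would more likely build an RDF model with a library and let it handle serialization.</p>

```java
import java.util.Arrays;
import java.util.List;

// Serializes one FluidDB object as Turtle following the id / tags / about mapping.
final class ObjectToTurtleSketch {

    static final String RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    static final String DC_DESCRIPTION = "http://purl.org/dc/elements/1.1/description";
    static final String XSD_STRING = "http://www.w3.org/2001/XMLSchema#string";

    // about may be null (no about on creation); tags may be empty (no bag emitted).
    static String toTurtle(String base, String uuid, String about, List<String> tags) {
        StringBuilder sb = new StringBuilder();
        // The id: mark the subject as a FluidDB object.
        sb.append('<').append(base).append("/objects/").append(uuid).append(">\n");
        sb.append("      <").append(RDF).append("type>\n");
        sb.append("              <").append(base).append("/objects/>");
        // The tags: type the subject as a bag and add one numbered member per tag.
        if (!tags.isEmpty()) {
            sb.append(" , <").append(RDF).append("Bag>");
        }
        for (int i = 0; i < tags.size(); i++) {
            sb.append(" ;\n      <").append(RDF).append('_').append(i + 1).append(">\n");
            sb.append("              <").append(base).append("/tags/").append(tags.get(i)).append('>');
        }
        // The about: a DC description literal, with backslashes and quotes escaped.
        if (about != null) {
            String literal = about.replace("\\", "\\\\").replace("\"", "\\\"");
            sb.append(" ;\n      <").append(DC_DESCRIPTION).append(">\n");
            sb.append("              \"").append(literal).append("\"^^<").append(XSD_STRING).append('>');
        }
        return sb.append(" .\n").toString();
    }

    public static void main(String[] args) {
        System.out.print(toTurtle(
                "http://sandbox.fluidinfo.com",
                "a10ab0f3-ef56-4fc0-a8fa-4d452d8ab1db",
                "Object for the attribute fluiddb/default/tags/permission/update/policy",
                Arrays.asList("fluiddb/about", "fluiddb/tags/path", "fluiddb/tags/description")));
    }
}
```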

<h2>Creating an RDF endpoint</h2>
<p>Armed with the previous mapping, the rest should be easy. Just allow querying for objects, then collect the object information, and finally generate the final RDF. Using <a href="http://seasr.org/meandre">Meandre</a> and <a href="http://github.com/rossjones/JFluidDB">JFluidDB</a> I wrote a few components that allow the simple creation of such an endpoint, as illustrated by the picture below.</p>
<p><a href="http://www.xavierllora.net/wp-content/uploads/2009/09/meandre-fluiddb-rdf.png"><img src="http://www.xavierllora.net/wp-content/uploads/2009/09/meandre-fluiddb-rdf.png" alt="Meandre FluidDB RDF endpoint" title="Meandre FluidDB RDF endpoint" width="500" height="204" class="aligncenter size-full wp-image-602" /></a></p>
<p>The basic mechanism is simple. Just push the query into the <em>Query for objects</em> component. This component will stream the <code>uuid</code> of each matched object to <em>Read object</em>, which pulls the object information. Then the object is passed to <em>Object to RDF model</em>, which generates the RDF snippet shown in the example above for each of the objects pushed. Next, all the RDF fragments are reduced together by the <em>Wrapped models reducer</em> component, and the resulting RDF model gets serialized into text using the Turtle notation. Finally, the serialized text is printed to the console. The equivalent code could be expressed as a <a href="http://seasr.org/meandre/documentation/for-developers/zigzag/">ZigZag</a> script as:</p>
<pre>
#
# Imports eliminated for clarity
#

#
# Create the component aliases
#
alias <meandre://fluidinfo.com/fluiddb/meandre/component/object-to-rdf> as OBJECT_TO_RDF
alias <meandre://fluidinfo.com/fluiddb/meandre/component/print-object> as PRINT_OBJECT
alias <meandre://fluidinfo.com/fluiddb/meandre/component/query-for-objects> as QUERY_FOR_OBJECTS
alias <meandre://fluidinfo.com/fluiddb/meandre/component/reads-the-requested-object> as READS_THE_REQUESTED_OBJECT
alias <meandre://seasr.org/components/tools/wrapped-models-reducer> as WRAPPED_MODELS_REDUCER
alias <meandre://seasr.org/components/tools/model-to-rdf-text> as MODEL_TO_RDF_TEXT
alias <meandre://fluidinfo.com/fluiddb/meandre/component/push-string> as PUSH_STRING

#
# Create the component instances
#
push_query_string = PUSH_STRING()
wrapped_models_reducer = WRAPPED_MODELS_REDUCER()
query_for_objects = QUERY_FOR_OBJECTS()
reads_object = READS_THE_REQUESTED_OBJECT()
model_to_rdf_text = MODEL_TO_RDF_TEXT()
print_rdf_text = PRINT_OBJECT()
object_to_rdf_model = OBJECT_TO_RDF()

#
# Set component properties
#
push_query_string.message = "has fluiddb/tags/path"
query_for_objects.fluiddb_url = "http://sandbox.fluidinfo.com"
reads_object.fluiddb_url = "http://sandbox.fluidinfo.com"
model_to_rdf_text.rdf_dialect = "TTL"

#
# Create the flow by connecting the components
#
@query_for_objects_outputs = query_for_objects()
@model_to_rdf_text_outputs = model_to_rdf_text()
@push_query_string_outputs = push_query_string()
@object_to_rdf_model_outputs = object_to_rdf_model()
@reads_object_outputs = reads_object()
@wrapped_models_reducer_outputs = wrapped_models_reducer()

query_for_objects(text: push_query_string_outputs.text)
model_to_rdf_text(model: wrapped_models_reducer_outputs.model)
object_to_rdf_model(object: reads_object_outputs.object)
reads_object(uuid: query_for_objects_outputs.uuid)[+200!]
print_rdf_text(object: model_to_rdf_text_outputs.text)
wrapped_models_reducer(model: object_to_rdf_model_outputs.model)
</pre>
<p>The only interesting element in the script is the <code>[+200!]</code> entry, which creates 200 parallel copies of the read object component. These copies concurrently hit FluidDB to pull the data, trying to minimize the latency. The script could be compiled into a MAU and run. The output of the execution would look like the following:</p>

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">$ java <span style="color: #339933;">-</span>jar zzre<span style="color: #339933;">-</span>1.4.7.<span style="color: #006633;">jar</span> pull<span style="color: #339933;">-</span>test.<span style="color: #006633;">mau</span> 
Meandre MAU Executor <span style="color: #009900;">&#91;</span>1.0.1vcli<span style="color: #339933;">/</span>1.4.7<span style="color: #009900;">&#93;</span>
All rights reserved by DITA, NCSA, UofI <span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">2007</span><span style="color: #339933;">-</span><span style="color: #cc66cc;">2009</span><span style="color: #009900;">&#41;</span>
<span style="color: #000000; font-weight: bold;">THIS</span> SOFTWARE IS PROVIDED UNDER University of Illinois<span style="color: #339933;">/</span>NCSA OPEN SOURCE LICENSE.
&nbsp;
<span style="color: #006633;">Executing</span> MAU file pull<span style="color: #339933;">-</span>test.<span style="color: #006633;">mau</span>
Creating temp dir pull<span style="color: #339933;">-</span>test.<span style="color: #006633;">mau</span>.<span style="color: #006633;">run</span>
Creating temp dir pull<span style="color: #339933;">-</span>test.<span style="color: #006633;">mau</span>.<span style="color: #006633;">public_resources</span>
&nbsp;
Preparing flow<span style="color: #339933;">:</span> meandre<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//seasr.org/zigzag/1253813636945/4416962494019783033/flow/pull-test-mau/</span>
<span style="color: #cc66cc;">2009</span><span style="color: #339933;">-</span>09<span style="color: #339933;">-</span><span style="color: #cc66cc;">24</span> <span style="color: #cc66cc;">12</span><span style="color: #339933;">:</span><span style="color: #cc66cc;">34</span><span style="color: #339933;">:</span><span style="color: #cc66cc;">38.480</span><span style="color: #339933;">::</span>INFO<span style="color: #339933;">:</span>  jetty<span style="color: #339933;">-</span><span style="color: #cc66cc;">6.1</span>.<span style="color: #006633;">x</span>
<span style="color: #cc66cc;">2009</span><span style="color: #339933;">-</span>09<span style="color: #339933;">-</span><span style="color: #cc66cc;">24</span> <span style="color: #cc66cc;">12</span><span style="color: #339933;">:</span><span style="color: #cc66cc;">34</span><span style="color: #339933;">:</span><span style="color: #cc66cc;">38.495</span><span style="color: #339933;">::</span>INFO<span style="color: #339933;">:</span>  Started SocketConnector@0.0.0.0<span style="color: #339933;">:</span><span style="color: #cc66cc;">1715</span>
Preparation completed correctly
&nbsp;
Execution started at<span style="color: #339933;">:</span> <span style="color: #cc66cc;">2009</span><span style="color: #339933;">-</span>09<span style="color: #339933;">-</span>24T12<span style="color: #339933;">:</span><span style="color: #cc66cc;">34</span><span style="color: #339933;">:</span><span style="color: #cc66cc;">38</span>
<span style="color: #339933;">----------------------------------------------------------------------------</span>
<span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//sandbox.fluidinfo.com/objects/a24b4a18-5483-47c6-9b62-0955210c7ebd&gt;</span>
      <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//www.w3.org/1999/02/22-rdf-syntax-ns#type&gt;</span>
              <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//sandbox.fluidinfo.com/objects/&gt; , &lt;http://www.w3.org/1999/02/22-rdf-syntax-ns#Bag&gt; ;</span>
      <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//www.w3.org/1999/02/22-rdf-syntax-ns#_1&gt;</span>
              <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//sandbox.fluidinfo.com/tags/fluiddb/about&gt; ;</span>
      <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//www.w3.org/1999/02/22-rdf-syntax-ns#_2&gt;</span>
              <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//sandbox.fluidinfo.com/tags/fluiddb/tags/path&gt; ;</span>
      <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//www.w3.org/1999/02/22-rdf-syntax-ns#_3&gt;</span>
              <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//sandbox.fluidinfo.com/tags/fluiddb/tags/description&gt; ;</span>
      <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//purl.org/dc/elements/1.1/description&gt;</span>
              <span style="color: #0000ff;">&quot;Object for the attribute test/Net::FluidDB-name-1253772095.82845-0.944567286499904&quot;</span><span style="color: #339933;">^^&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//www.w3.org/2001/XMLSchema#string&gt; .</span>
&nbsp;
<span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//sandbox.fluidinfo.com/objects/5ff74371-455b-4299-83f9-ba13ae898ad1&gt;</span>
      <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//www.w3.org/1999/02/22-rdf-syntax-ns#type&gt;</span>
              <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//sandbox.fluidinfo.com/objects/&gt; , &lt;http://www.w3.org/1999/02/22-rdf-syntax-ns#Bag&gt; ;</span>
      <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//www.w3.org/1999/02/22-rdf-syntax-ns#_1&gt;</span>
              <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//sandbox.fluidinfo.com/tags/fluiddb/about&gt; ;</span>
      <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//www.w3.org/1999/02/22-rdf-syntax-ns#_2&gt;</span>
              <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//sandbox.fluidinfo.com/tags/fluiddb/tags/path&gt; ;</span>
      <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//www.w3.org/1999/02/22-rdf-syntax-ns#_3&gt;</span>
              <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//sandbox.fluidinfo.com/tags/fluiddb/tags/description&gt; ;</span>
      <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//purl.org/dc/elements/1.1/description&gt;</span>
              <span style="color: #0000ff;">&quot;Object for the attribute test/Net::FluidDB-name-1253622685.3231461-0.437099602163897316&quot;</span><span style="color: #339933;">^^&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//www.w3.org/2001/XMLSchema#string&gt; .</span>
&nbsp;
<span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//sandbox.fluidinfo.com/objects/67e52346-527e-4bb7-b8f3-05fa8a8ae35b&gt;</span>
      <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//www.w3.org/1999/02/22-rdf-syntax-ns#type&gt;</span>
              <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//sandbox.fluidinfo.com/objects/&gt; , &lt;http://www.w3.org/1999/02/22-rdf-syntax-ns#Bag&gt; ;</span>
      <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//www.w3.org/1999/02/22-rdf-syntax-ns#_1&gt;</span>
              <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//sandbox.fluidinfo.com/tags/fluiddb/about&gt; ;</span>
      <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//www.w3.org/1999/02/22-rdf-syntax-ns#_2&gt;</span>
              <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//sandbox.fluidinfo.com/tags/fluiddb/tags/path&gt; ;</span>
      <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//www.w3.org/1999/02/22-rdf-syntax-ns#_3&gt;</span>
              <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//sandbox.fluidinfo.com/tags/fluiddb/tags/description&gt; ;</span>
      <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//purl.org/dc/elements/1.1/description&gt;</span>
              <span style="color: #0000ff;">&quot;Object for the attribute test/Net::FluidDB-name-1253620190.69175-0.861614257420541&quot;</span><span style="color: #339933;">^^&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//www.w3.org/2001/XMLSchema#string&gt; .</span>
&nbsp;
<span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//sandbox.fluidinfo.com/objects/8a65a184-03d9-4881-95df-02fa0561a86f&gt;</span>
      <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//www.w3.org/1999/02/22-rdf-syntax-ns#type&gt;</span>
              <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//sandbox.fluidinfo.com/objects/&gt; , &lt;http://www.w3.org/1999/02/22-rdf-syntax-ns#Bag&gt; ;</span>
      <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//www.w3.org/1999/02/22-rdf-syntax-ns#_1&gt;</span>
              <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//sandbox.fluidinfo.com/tags/fluiddb/about&gt; ;</span>
      <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//www.w3.org/1999/02/22-rdf-syntax-ns#_2&gt;</span>
              <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//sandbox.fluidinfo.com/tags/fluiddb/tags/path&gt; ;</span>
      <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//www.w3.org/1999/02/22-rdf-syntax-ns#_3&gt;</span>
              <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//sandbox.fluidinfo.com/tags/fluiddb/tags/description&gt; ;</span>
      <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//purl.org/dc/elements/1.1/description&gt;</span>
              <span style="color: #0000ff;">&quot;Object for the attribute fluiddb/namespaces/permission/update/exceptions&quot;</span><span style="color: #339933;">^^&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//www.w3.org/2001/XMLSchema#string&gt; .</span>
&nbsp;
<span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//sandbox.fluidinfo.com/objects/335b44e9-a72f-479d-ad60-3661a35231ba&gt;</span>
      <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//www.w3.org/1999/02/22-rdf-syntax-ns#type&gt;</span>
              <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//sandbox.fluidinfo.com/objects/&gt; , &lt;http://www.w3.org/1999/02/22-rdf-syntax-ns#Bag&gt; ;</span>
      <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//www.w3.org/1999/02/22-rdf-syntax-ns#_1&gt;</span>
              <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//sandbox.fluidinfo.com/tags/fluiddb/about&gt; ;</span>
      <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//www.w3.org/1999/02/22-rdf-syntax-ns#_2&gt;</span>
              <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//sandbox.fluidinfo.com/tags/fluiddb/tags/path&gt; ;</span>
      <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//www.w3.org/1999/02/22-rdf-syntax-ns#_3&gt;</span>
              <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//sandbox.fluidinfo.com/tags/fluiddb/tags/description&gt; ;</span>
      <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//purl.org/dc/elements/1.1/description&gt;</span>
              <span style="color: #0000ff;">&quot;Object for the attribute test/Net::FluidDB-name-1253776141.95577-0.284175700598524&quot;</span><span style="color: #339933;">^^&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//www.w3.org/2001/XMLSchema#string&gt; .</span>
&nbsp;
<span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//sandbox.fluidinfo.com/objects/3bbf1cc6-731c-4e56-a664-adeb5484334f&gt;</span>
      <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//www.w3.org/1999/02/22-rdf-syntax-ns#type&gt;</span>
              <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//sandbox.fluidinfo.com/objects/&gt; , &lt;http://www.w3.org/1999/02/22-rdf-syntax-ns#Bag&gt; ;</span>
      <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//www.w3.org/1999/02/22-rdf-syntax-ns#_1&gt;</span>
              <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//sandbox.fluidinfo.com/tags/fluiddb/about&gt; ;</span>
      <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//www.w3.org/1999/02/22-rdf-syntax-ns#_2&gt;</span>
              <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//sandbox.fluidinfo.com/tags/fluiddb/tags/path&gt; ;</span>
      <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//www.w3.org/1999/02/22-rdf-syntax-ns#_3&gt;</span>
              <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//sandbox.fluidinfo.com/tags/fluiddb/tags/description&gt; ;</span>
      <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//purl.org/dc/elements/1.1/description&gt;</span>
              <span style="color: #0000ff;">&quot;Object for the attribute fluiddb/namespaces/permission/delete/policy&quot;</span><span style="color: #339933;">^^&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//www.w3.org/2001/XMLSchema#string&gt; .</span>
&nbsp;
<span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//sandbox.fluidinfo.com/objects/aba5adcf-fd44-40ab-b702-9cc635650bc3&gt;</span>
      <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//www.w3.org/1999/02/22-rdf-syntax-ns#type&gt;</span>
              <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//sandbox.fluidinfo.com/objects/&gt; , &lt;http://www.w3.org/1999/02/22-rdf-syntax-ns#Bag&gt; ;</span>
      <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//www.w3.org/1999/02/22-rdf-syntax-ns#_1&gt;</span>
              <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//sandbox.fluidinfo.com/tags/fluiddb/about&gt; ;</span>
      <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//www.w3.org/1999/02/22-rdf-syntax-ns#_2&gt;</span>
              <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//sandbox.fluidinfo.com/tags/fluiddb/tags/path&gt; ;</span>
      <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//www.w3.org/1999/02/22-rdf-syntax-ns#_3&gt;</span>
              <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//sandbox.fluidinfo.com/tags/fluiddb/tags/description&gt; ;</span>
      <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//purl.org/dc/elements/1.1/description&gt;</span>
              <span style="color: #0000ff;">&quot;Object for the attribute test/Net::FluidDB-name-1253614713.757-0.604769721717702&quot;</span><span style="color: #339933;">^^&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//www.w3.org/2001/XMLSchema#string&gt; .</span>
&nbsp;
<span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//sandbox.fluidinfo.com/objects/f61ceb3b-33df-4356-8e7d-c56d3d0ae338&gt;</span>
      <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//www.w3.org/1999/02/22-rdf-syntax-ns#type&gt;</span>
              <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//sandbox.fluidinfo.com/objects/&gt; , &lt;http://www.w3.org/1999/02/22-rdf-syntax-ns#Bag&gt; ;</span>
      <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//www.w3.org/1999/02/22-rdf-syntax-ns#_1&gt;</span>
              <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//sandbox.fluidinfo.com/tags/fluiddb/about&gt; ;</span>
      <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//www.w3.org/1999/02/22-rdf-syntax-ns#_2&gt;</span>
              <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//sandbox.fluidinfo.com/tags/fluiddb/tags/path&gt; ;</span>
      <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//www.w3.org/1999/02/22-rdf-syntax-ns#_3&gt;</span>
              <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//sandbox.fluidinfo.com/tags/fluiddb/tags/description&gt; ;</span>
      <span style="color: #339933;">&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//purl.org/dc/elements/1.1/description&gt;</span>
              <span style="color: #0000ff;">&quot;Object for the attribute test/Net::FluidDB-name-1253615887.80879-0.0437609496034099&quot;</span><span style="color: #339933;">^^&lt;</span>http<span style="color: #339933;">:</span><span style="color: #666666; font-style: italic;">//www.w3.org/2001/XMLSchema#string&gt; .</span>
&nbsp;
...</pre></div></div>

<p>That&#8217;s it! A first RDF dump of the query!</p>
<h2>The not so great news</h2>
<p>The current <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> API does not provide any method to pull data from more than one object at once. That means that each <code>uuid</code> requires its own call to the server, which is a huge latency overhead. The <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> guys know about it and they are scratching their heads on how to provide a &#8220;multi get&#8221;. A full trace of the output can be found on this <a href='http://www.xavierllora.net/wp-content/uploads/2009/09/fluiddb.txt'>FluidDB RDF endpoint trace</a>.</p>
<p>A multi get is crucial for any RDF endpoint. Above I left out one basic element: the timing measurements. That part looks like:</p>
<pre>
Flow execution statistics

Flow unique execution ID : meandre://seasr.org/zigzag/1253813636945/4416962494019783033/flow/pull-test-mau/8D8E354A/1253813678323/1493255769/
Flow state               : ended
Started at               : Thu Sep 24 12:34:38 CDT 2009
Last update              : Thu Sep 24 12:37:28 CDT 2009
Total run time (ms)      : 170144
</pre>
<p>Basically, it took 170s to pull only 238 objects, with all the time spent round tripping to <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a>.</p>
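<p>To put that number in perspective, here is a back-of-the-envelope calculation using only the figures from the statistics above. It assumes, as stated, that essentially all of the run time goes to per-object round trips. The <code>estimated_minutes</code> helper and its linear extrapolation are illustrative assumptions (Python 3), not measurements.</p>

```python
# Figures copied from the flow execution statistics above.
total_run_time_ms = 170144
objects_pulled = 238

# Average wall-clock cost per object, assuming one HTTP round trip each.
ms_per_object = total_run_time_ms / objects_pulled
print(f"{ms_per_object:.0f} ms per object")


def estimated_minutes(n_objects):
    """Hypothetical extrapolation: assumes the cost grows linearly."""
    return n_objects * ms_per_object / 1000 / 60


print(f"{estimated_minutes(10000):.0f} minutes for 10000 objects")
```

<p>At this rate, 10,000 objects would take roughly two hours under the same assumption.</p>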
<h2>Getting there</h2>
<p>Such high latency would not allow efficient interactive usage of the endpoint. However, this exercise was useful to prove that simple RDF endpoints for <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> are possible and would greatly boost the flexibility of interaction with <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a>. The current form of the endpoint may still have value if you are not in a hurry, allowing you to run <a href="http://www.w3.org/TR/rdf-sparql-query/">SPARQL</a> queries against <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> data and get the best of both worlds.</p>
<h2>The code used</h2>
<p>If you are interested in running the code, you will need <a href="http://seasr.org/meandre">Meandre</a> and the components I put together for the experiment, which you can get from <a href="http://github.com/xllora/liquid">http://github.com/xllora/liquid</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.xavierllora.net/2009/09/24/liquid-rdf-endpoint-for-fluiddb/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Liquid: RDF meandering in FluidDB</title>
		<link>http://www.xavierllora.net/2009/08/25/liquid-rdf-meandering-in-fluiddb/</link>
		<comments>http://www.xavierllora.net/2009/08/25/liquid-rdf-meandering-in-fluiddb/#comments</comments>
		<pubDate>Tue, 25 Aug 2009 18:04:20 +0000</pubDate>
		<dc:creator>Xavier</dc:creator>
				<category><![CDATA[Data-Intensive Computing]]></category>
		<category><![CDATA[Notes]]></category>
		<category><![CDATA[Research]]></category>
		<category><![CDATA[Social Networks]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[cloud computing]]></category>
		<category><![CDATA[FluidDB]]></category>
		<category><![CDATA[meandre]]></category>
		<category><![CDATA[RDF]]></category>
		<category><![CDATA[storage]]></category>

		<guid isPermaLink="false">http://www.xavierllora.net/?p=577</guid>
		<description><![CDATA[Meandre (a data-intensive computing infrastructure pushed by NCSA) relies on RDF to describe components, flows, locations and repositories. RDF has become the central piece that makes possible Meandre&#8217;s flexibility and reusability. However, one piece still remains largely sketchy and still has no clear optimal solution: how can we make it easy for anybody to share, publish and annotate flows, components, [...]
]]></description>
			<content:encoded><![CDATA[<p><a href="http://seasr.org/meandre/">Meandre</a> (a data-intensive computing infrastructure pushed by <a href="http://www.ncsa.illinois.edu">NCSA</a>) relies on <a href="http://www.w3.org/RDF/">RDF</a> to describe components, flows, locations and repositories. <a href="http://www.w3.org/RDF/">RDF</a> has become the central piece that makes possible <a href="http://seasr.org/meandre/">Meandre</a>&#8217;s flexibility and reusability. However, one piece still remains largely sketchy and still has no clear optimal solution: how can we make it easy for anybody to share, publish and annotate flows, components, locations and repositories? More importantly, how can that be done in the cloud in an open-ended fashion, allowing anybody to annotate and comment on each of the aforementioned pieces?</p>
<h3>The FluidDB trip</h3>
<p>During my last summer trip to Europe, <a href="http://blogs.fluidinfo.com/terry/">Terry Jones</a> (CEO) invited me to visit <a href="http://www.fluidinfo.com/">FluidInfo</a> (based in Barcelona), where I also met <a href="http://blogs.fluidinfo.com/esteve/">Esteve Fernandez</a> (CTO). I had a great opportunity to chat with the masterminds behind an intriguing concept I ran into after a short note I received from <a href="http://www.illigal.uiuc.edu/web/deg/vita/">David E. Goldberg</a>. <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a>, the main product being pushed by <a href="http://www.fluidinfo.com/">FluidInfo</a>, is an online collaborative &#8220;cloud&#8221; database. In <a href="http://www.fluidinfo.com/">FluidInfo</a>&#8217;s words:</p>
<blockquote><p>
FluidDB lets data be social. It allows almost unlimited information personalization by individual users and applications, and also between them. This makes it simple to build a wide variety of applications that benefit from cooperation, and which are open to unanticipated future enhancements. Even more importantly, FluidDB facilitates and encourages the growth of applications that leave users in control of their own data.
</p></blockquote>
<p><a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> went live on a private alpha last week. The basic concept behind the scenes is simple. <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> stores objects. Objects do not belong to anybody. Objects may be &#8220;blank&#8221; or they may be about something (e.g. <a href="http://seasr.org/meandre">http://seasr.org/meandre</a>). You can create as many blank objects as you want. Creating an object with the same about value always returns the same object (thus, there will only be one object about <a href="http://seasr.org/meandre">http://seasr.org/meandre</a>). Once objects exist, things start getting more interesting: you can go and tag any object with whatever tag you want. For instance, I could tag the <a href="http://seasr.org/meandre">http://seasr.org/meandre</a> object with a <code>hosted_by</code> tag, and assign the tag the value <a href="http://www.ncsa.illinois.edu">&#8220;National Center for Supercomputing Applications&#8221;</a>. Values can be anything you want, from text and numerals to blobs. Finally, <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> introduces one last trick: namespaces. For instance, I got <code>xllora</code>. That means that the tag I mentioned above would look like <code>/tag/xllora/hosted_by</code>. You can create as many nested namespaces under your main namespace as you want. <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> also provides mechanisms to control who can query and see the values of the tags you create.</p>
<p>As you can see, the basic object model and its mechanics are very simple. When the alpha went live, <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> only provided access via a simple REST-like HTTP API. Within a few days, a blossom of client libraries wrapping that API had been developed by a dynamic community that gathers on the <code>#fluiddb</code> channel on <code>irc.freenode.net</code>, where <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> and early adopters share experiences.</p>
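<p>The object model described above can be sketched in a few lines of Python 3. This is a toy, in-memory stand-in written purely for illustration: the <code>ToyFluidDB</code> class and its method names are hypothetical and are not the FluidDB HTTP API or any client library, and permissions are left out entirely.</p>

```python
import uuid


class ToyFluidDB:
    """In-memory stand-in for the object model; not the real FluidDB API."""

    def __init__(self):
        self.objects = {}    # object id -> {tag path: value}
        self.by_about = {}   # about string -> object id

    def create_object(self, about=None):
        # An object with a given "about" is unique: reuse it if it exists.
        if about is not None and about in self.by_about:
            return self.by_about[about]
        object_id = str(uuid.uuid4())
        self.objects[object_id] = {}
        if about is not None:
            self.by_about[about] = object_id
        return object_id

    def tag(self, object_id, namespace, name, value):
        # Tags live under the namespace of whoever tags, e.g. xllora/hosted_by.
        self.objects[object_id][f"{namespace}/{name}"] = value


db = ToyFluidDB()
first = db.create_object("http://seasr.org/meandre")
second = db.create_object("http://seasr.org/meandre")  # same object as first
blank = db.create_object()                             # a new blank object
db.tag(first, "xllora", "hosted_by",
       "National Center for Supercomputing Applications")
```

<p>Here <code>first</code> and <code>second</code> hold the same identifier, while every call without an about value yields a fresh blank object.</p>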
<h3>You were saying something about RDF</h3>
<p>Back to the point. One thing I asked the <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> guys was what they thought about the similarities between <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a>&#8217;s object model and RDF. After playing with RDF for a while, I found that the <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> model looks awfully familiar, although it is much simpler and more manageable than <a href="http://www.w3.org/RDF/">RDF</a>. They did not have much to say about it, and the question got stuck in the back of my mind. So when I got access to the private alpha, I could not help going down the path of what it would mean to map <a href="http://www.w3.org/RDF/">RDF</a> onto <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a>. Yes, the simple, straight answer would be to stick serialized RDF into the value of a given tag (e.g. <code>xllora/rdf</code>). However, that option seemed poor, since I could not exploit the social aspect of collaborative annotations provided by <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a>. So back to the drawing board. What do both models have in common? They are both descriptions about something. In RDF you can see those as the subjects of the triples, whereas in <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> those are simple objects. <a href="http://www.w3.org/RDF/">RDF</a> uses properties to qualify objects. <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> uses tags. Both enable you to add values to qualified objects. Mmh, there you go.</p>
<p>With this idea in mind, I started <a href="http://github.com/xllora/liquid/tree/master">Liquid</a>, a simple proof-of-concept library that maps <a href="http://www.w3.org/RDF/">RDF</a> onto <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> and then gets it back. There was only one thing that needed a bit of patching. <a href="http://www.w3.org/RDF/">RDF</a> properties are arbitrary URIs. Those could not easily be mapped on top of <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> tags, so I took a simple compromise route.</p>
<ul>
<li><a href="http://www.w3.org/RDF/">RDF</a> subject URIs are mapped onto FluidDB qualified objects via the about tag</li>
<li>One <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> tag will contain all the properties for that object (basically a simple dictionary encoded in JSON)</li>
<li>References to other <a href="http://www.w3.org/RDF/">RDF</a> URIs will be mapped onto <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> object URIs, and vice versa</li>
</ul>
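<p>The first two points of the compromise above can be sketched with plain Python 3 data structures, without touching FluidDB at all. This is only a guess at the shape of the mapping, not the actual Liquid code: the function names are hypothetical, the tag name is borrowed from the listing further down, and the third point (rewriting references between RDF URIs and FluidDB object URIs) is not covered.</p>

```python
import json
from collections import defaultdict

# Tag name borrowed from the post's listing; its use here is an assumption.
PROPERTIES_TAG = "rdf_properties"


def triples_to_objects(triples):
    """Group (subject, predicate, value) triples by subject.

    Returns {about: {PROPERTIES_TAG: json_string}}: one qualified object per
    subject URI, carrying a single tag whose value is a JSON dictionary.
    """
    properties = defaultdict(dict)
    for subject, predicate, value in triples:
        # Keep every value in a list so repeated predicates are not lost.
        properties[subject].setdefault(predicate, []).append(value)
    return {
        about: {PROPERTIES_TAG: json.dumps(props, sort_keys=True)}
        for about, props in properties.items()
    }


def objects_to_triples(objects):
    """Inverse mapping: decode each JSON dictionary back into triples."""
    triples = []
    for about, tags in objects.items():
        for predicate, values in json.loads(tags[PROPERTIES_TAG]).items():
            for value in values:
                triples.append((about, predicate, value))
    return triples


# Round trip with the record-shop triple used later in this post.
triples = [("http://www.recshop.fake/cd/Empire Burlesque",
            "http://www.recshop.fake/cd#artist",
            "Bob Dylan")]
objects = triples_to_objects(triples)
recovered = objects_to_triples(objects)
```

<p>Storing values as lists is a design choice of this sketch, made so that a subject with several values for the same property survives the round trip.</p>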
<p>Let&#8217;s make it a bit more chewable with a simple example.</p>

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;"><span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;?xml</span> <span style="color: #000066;">version</span>=<span style="color: #ff0000;">&quot;1.0&quot;</span><span style="color: #000000; font-weight: bold;">?&gt;</span></span>
&nbsp;
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;rdf:RDF</span></span>
<span style="color: #009900;"><span style="color: #000066;">xmlns:rdf</span>=<span style="color: #ff0000;">&quot;http://www.w3.org/1999/02/22-rdf-syntax-ns#&quot;</span></span>
<span style="color: #009900;"><span style="color: #000066;">xmlns:cd</span>=<span style="color: #ff0000;">&quot;http://www.recshop.fake/cd#&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
&nbsp;
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;rdf:Description</span></span>
<span style="color: #009900;"><span style="color: #000066;">rdf:about</span>=<span style="color: #ff0000;">&quot;http://www.recshop.fake/cd/Empire Burlesque&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
  <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;cd:artist<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>Bob Dylan<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/cd:artist<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
 <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/rdf:Description<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
&nbsp;
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/rdf:RDF<span style="color: #000000; font-weight: bold;">&gt;</span></span></span></pre></div></div>

<p>The above <a href="http://www.w3.org/RDF/">RDF</a> represents a single triple:</p>

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;">http://www.recshop.fake/cd/Empire Burlesque	http://www.recshop.fake/cd#artist	   &quot;Bob Dylan&quot;</pre></div></div>
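<p>For readers who want to check how the RDF/XML above collapses into that single triple, here is a minimal sketch using only the Python 3 standard library. It handles just this flat shape (descriptions with literal-valued properties) and is not a general RDF parser. The <code>literal_triples</code> function name is hypothetical.</p>

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The RDF/XML document from the example above.
RDF_XML = """<?xml version="1.0"?>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:cd="http://www.recshop.fake/cd#">
<rdf:Description
rdf:about="http://www.recshop.fake/cd/Empire Burlesque">
  <cd:artist>Bob Dylan</cd:artist>
</rdf:Description>
</rdf:RDF>"""


def literal_triples(rdf_xml):
    """Extract (subject, predicate, literal) triples from flat RDF/XML."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree spells qualified names as "{namespace}local".
            namespace, local = prop.tag[1:].split("}")
            triples.append((subject, namespace + local, prop.text))
    return triples


print(literal_triples(RDF_XML))
```

<p>The predicate URI is simply the namespace bound to the <code>cd</code> prefix concatenated with the element name.</p>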

<p>This triple could be mapped onto <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> by creating one qualified <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> object and adding the proper tags. The example below shows how to do so using <a href="http://github.com/njr0/fdb.py/tree/master">Python&#8217;s fdb.py client library</a> by <a href="http://StochasticSolutions.com/about.html">Nicholas J. Radcliffe</a>.</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #ff7700;font-weight:bold;">import</span> fdb,<span style="color: #dc143c;">sys</span>
<span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: #dc143c;">sys</span>.<span style="color: black;">version_info</span> <span style="color: #66cc66;">&lt;</span> <span style="color: black;">&#40;</span><span style="color: #ff4500;">2</span>, <span style="color: #ff4500;">6</span><span style="color: black;">&#41;</span>:
    <span style="color: #ff7700;font-weight:bold;">import</span> simplejson <span style="color: #ff7700;font-weight:bold;">as</span> json
<span style="color: #ff7700;font-weight:bold;">else</span>:
    <span style="color: #ff7700;font-weight:bold;">import</span> json
&nbsp;
__RDF_TAG__ = <span style="color: #483d8b;">'rdf'</span>
__RDF_TAG_PROPERTIES__  = <span style="color: #483d8b;">'rdf_properties'</span>
__RDF_TAG_MODEL_NAME__ = <span style="color: #483d8b;">'rdf_model_name'</span>
&nbsp;
<span style="color: #808080; font-style: italic;">#</span>
<span style="color: #808080; font-style: italic;"># Initialize the FluidDB client library</span>
<span style="color: #808080; font-style: italic;">#</span>
f = fdb.<span style="color: black;">FluidDB</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
<span style="color: #808080; font-style: italic;">#</span>
<span style="color: #808080; font-style: italic;"># Create the tags (if they exist, this won't hurt)</span>
<span style="color: #808080; font-style: italic;">#</span>
f.<span style="color: black;">create_abstract_tag</span><span style="color: black;">&#40;</span>__RDF_TAG__<span style="color: black;">&#41;</span>
f.<span style="color: black;">create_abstract_tag</span><span style="color: black;">&#40;</span>__RDF_TAG_PROPERTIES__<span style="color: black;">&#41;</span>
f.<span style="color: black;">create_abstract_tag</span><span style="color: black;">&#40;</span>__RDF_TAG_MODEL_NAME__<span style="color: black;">&#41;</span>
<span style="color: #808080; font-style: italic;">#</span>
<span style="color: #808080; font-style: italic;"># Create the subject object of the triple</span>
<span style="color: #808080; font-style: italic;">#	</span>
o = f.<span style="color: black;">create_object</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'http://www.recshop.fake/cd/Empire Burlesque'</span><span style="color: black;">&#41;</span>
<span style="color: #808080; font-style: italic;">#</span>
<span style="color: #808080; font-style: italic;"># Map RDF properties</span>
<span style="color: #808080; font-style: italic;">#</span>
properties = <span style="color: black;">&#123;</span><span style="color: #483d8b;">'http://www.recshop.fake/cd#artist'</span>:<span style="color: black;">&#91;</span><span style="color: #483d8b;">'Bob Dylan'</span><span style="color: black;">&#93;</span><span style="color: black;">&#125;</span>
<span style="color: #808080; font-style: italic;">#</span>
<span style="color: #808080; font-style: italic;"># Tag the object as RDF aware, properties available, and to which model/named graph </span>
<span style="color: #808080; font-style: italic;"># it belongs</span>
<span style="color: #808080; font-style: italic;">#</span>
f.<span style="color: black;">tag_object_by_id</span><span style="color: black;">&#40;</span>o.<span style="color: #008000;">id</span>, __RDF_TAG__<span style="color: black;">&#41;</span>
f.<span style="color: black;">tag_object_by_id</span><span style="color: black;">&#40;</span>o.<span style="color: #008000;">id</span>,__RDF_TAG_PROPERTIES__,value=json.<span style="color: black;">dumps</span><span style="color: black;">&#40;</span>properties<span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>
f.<span style="color: black;">tag_object_by_id</span><span style="color: black;">&#40;</span>o.<span style="color: #008000;">id</span>, __RDF_TAG_MODEL_NAME__,<span style="color: #483d8b;">'test_dummy'</span><span style="color: black;">&#41;</span></pre></div></div>

<p>Running with this basic idea, I quickly stitched together <a href="http://github.com/xllora/liquid/tree/master">a simple library (Liquid)</a> that allows ingestion and retrieval of <a href="http://www.w3.org/RDF/">RDF</a> from <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a>. It is still very rudimentary and may not properly map all possible <a href="http://www.w3.org/RDF/">RDF</a>, but it is a working proof-of-concept implementation showing that it is possible to do so.</p>
<p>The Python code above just saves a triple. You can easily retrieve the triple by performing the following operation:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #ff7700;font-weight:bold;">import</span> fdb,<span style="color: #dc143c;">sys</span>
<span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: #dc143c;">sys</span>.<span style="color: black;">version_info</span> <span style="color: #66cc66;">&lt;</span> <span style="color: black;">&#40;</span><span style="color: #ff4500;">2</span>, <span style="color: #ff4500;">6</span><span style="color: black;">&#41;</span>:
    <span style="color: #ff7700;font-weight:bold;">import</span> simplejson <span style="color: #ff7700;font-weight:bold;">as</span> json
<span style="color: #ff7700;font-weight:bold;">else</span>:
    <span style="color: #ff7700;font-weight:bold;">import</span> json
&nbsp;
__RDF_TAG__ = <span style="color: #483d8b;">'rdf'</span>
__RDF_TAG_PROPERTIES__  = <span style="color: #483d8b;">'rdf_properties'</span>
__RDF_TAG_MODEL_NAME__ = <span style="color: #483d8b;">'rdf_model_name'</span>
&nbsp;
<span style="color: #808080; font-style: italic;">#</span>
<span style="color: #808080; font-style: italic;"># Initialize the FluidDB client library</span>
<span style="color: #808080; font-style: italic;">#</span>
f = fdb.<span style="color: black;">FluidDB</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
<span style="color: #808080; font-style: italic;">#</span>
<span style="color: #808080; font-style: italic;"># Retrieve the annotated objects</span>
<span style="color: #808080; font-style: italic;">#</span>
objs = f.<span style="color: black;">query</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'has xllora/%s'</span><span style="color: #66cc66;">%</span><span style="color: black;">&#40;</span>__RDF_TAG__<span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>
<span style="color: #808080; font-style: italic;">#</span>
<span style="color: #808080; font-style: italic;"># Optionally you could retrieve the ones only belonging to a given model by</span>
<span style="color: #808080; font-style: italic;">#</span>
<span style="color: #808080; font-style: italic;"># objs = f.query('has xllora/%s and xllora/%s matches &quot;%s&quot;'%(__RDF_TAG__,__RDF_TAG_MODEL_NAME__,modelname))</span>
<span style="color: #808080; font-style: italic;">#</span>
subs = <span style="color: black;">&#91;</span>f.<span style="color: black;">get_tag_value_by_id</span><span style="color: black;">&#40;</span>s,<span style="color: #483d8b;">'/tags/fluiddb/about'</span><span style="color: black;">&#41;</span> <span style="color: #ff7700;font-weight:bold;">for</span> s <span style="color: #ff7700;font-weight:bold;">in</span> objs<span style="color: black;">&#93;</span>
props_tmp = <span style="color: black;">&#91;</span>f.<span style="color: black;">get_tag_value_by_id</span><span style="color: black;">&#40;</span>s,<span style="color: #483d8b;">'/tags/xllora/'</span>+__RDF_TAG_PROPERTIES__<span style="color: black;">&#41;</span> <span style="color: #ff7700;font-weight:bold;">for</span> s <span style="color: #ff7700;font-weight:bold;">in</span> objs<span style="color: black;">&#93;</span>
props = <span style="color: black;">&#91;</span>json.<span style="color: black;">loads</span><span style="color: black;">&#40;</span>s<span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span> <span style="color: #ff7700;font-weight:bold;">if</span> s<span style="color: black;">&#91;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#93;</span>==<span style="color: #ff4500;">200</span> <span style="color: #ff7700;font-weight:bold;">else</span> <span style="color: black;">&#123;</span><span style="color: black;">&#125;</span> <span style="color: #ff7700;font-weight:bold;">for</span> s <span style="color: #ff7700;font-weight:bold;">in</span> props_tmp<span style="color: black;">&#93;</span></pre></div></div>

<p>Now <code>subs</code> contains all the subject URIs for the predicates, and <code>props</code> all the dictionaries containing the properties.</p>
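<p>To make the shape of those two lists concrete, here is a small sketch that flattens them back into triples. It assumes <code>subs</code> has already been reduced to plain URI strings (the <code>props</code> line above suggests the client returns status/value pairs, so an unwrapping step may be needed first). The helper <code>to_triples</code> is hypothetical and not part of fdb.py or Liquid.</p>

```python
import json


def to_triples(subjects, properties):
    """Pair each subject URI with its property dictionary and flatten into triples."""
    triples = []
    for subject, props in zip(subjects, properties):
        for predicate in sorted(props):
            # Each predicate maps to a list of values, as in the storing example.
            for value in props[predicate]:
                triples.append((subject, predicate, value))
    return triples


# Round trip of the JSON-encoded tag value used when the triple was stored.
stored = json.dumps({'http://www.recshop.fake/cd#artist': ['Bob Dylan']})
subs = ['http://www.recshop.fake/cd/Empire Burlesque']
props = [json.loads(stored)]

triples = to_triples(subs, props)
for triple in triples:
    print(triple)
```

<p>Objects whose properties could not be read end up with an empty dictionary in <code>props</code>, so they simply contribute no triples.</p>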
<h3>The bottom line</h3>
<p>OK. So, why is this mapping important? Basically, it will allow collaborative tagging of the created objects (subjects), allowing a collaborative and social gathering of information alongside the mapped <a href="http://www.w3.org/RDF/">RDF</a>. So, what does it all mean?</p>
<p>It basically means that, if you do not need to ingest <a href="http://www.w3.org/RDF/">RDF</a> (where property URIs are not directly mapped and you need to Fluidify/reify), any data stored in <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> is already in some form of triplified RDF. Let me explain what I mean by that. Each <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> object has a unique URI (e.g. <code>http://fluidDB.fluidinfo.com/objects/4fdf7ff4-f0da-4441-8e63-9b98ed26fc12</code>). Each tag is also uniquely identified by a URI (e.g. <code>http://fluidDB.fluidinfo.com/tags/xllora/rdf_model_name</code>). And finally, each object/tag pair may have a value (e.g. a literal <code>'test_dummy'</code> or maybe another URI <code>http://fluidDB.fluidinfo.com/objects/a0dda173-9ee0-4799-a507-8710045d2b07</code>). If an object/tag pair does not have a value, you can just point it to a no-value URI (or some other convention you like).</p>
<p>Having said that, now you have all the pieces to express <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> data in plain shareable RDF. That basically means getting all the tags for an object, querying their values, and then generating an <a href="http://www.w3.org/RDF/">RDF</a> model by adding the gathered triples. That&#8217;s easy. Also, if you align your properties to tags, the ingestion becomes just as trivial. I will try to get that piece into <a href="http://github.com/xllora/liquid/tree/master">Liquid</a> as soon as other issues allow me to do so <img src='http://www.xavierllora.net/wp-includes/images/smilies/icon_biggrin.gif' alt=':D' class='wp-smiley' /> .</p>
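<p>A rough sketch of that export step, under stated assumptions: the tag values for an object have already been fetched into a plain dictionary, the <code>NO_VALUE</code> URI is a made-up convention, any string starting with <code>http://</code> is treated as a URI, and <code>json.dumps</code> only approximates N-Triples literal escaping. The function <code>object_to_ntriples</code> is hypothetical and not part of Liquid.</p>

```python
import json

BASE = 'http://fluidDB.fluidinfo.com'
NO_VALUE = BASE + '/novalue'  # hypothetical "no value" URI; any convention works


def object_to_ntriples(object_id, tag_values):
    """Render one object as N-Triples-style lines.

    tag_values maps tag paths (e.g. 'xllora/rdf_model_name') to a value,
    or to None when the object/tag pair carries no value.
    """
    subject = '<%s/objects/%s>' % (BASE, object_id)
    lines = []
    for tag_path in sorted(tag_values):
        predicate = '<%s/tags/%s>' % (BASE, tag_path)
        value = tag_values[tag_path]
        if value is None:
            obj = '<%s>' % NO_VALUE
        elif isinstance(value, str) and value.startswith('http://'):
            # Heuristic (assumption): values that look like URIs become resources.
            obj = '<%s>' % value
        else:
            # Approximate literal escaping by reusing JSON string quoting.
            obj = json.dumps(str(value))
        lines.append('%s %s %s .' % (subject, predicate, obj))
    return lines


lines = object_to_ntriples(
    '4fdf7ff4-f0da-4441-8e63-9b98ed26fc12',
    {'xllora/rdf': None, 'xllora/rdf_model_name': 'test_dummy'},
)
for line in lines:
    print(line)
```

<p>Going the other way, ingestion with properties aligned to tags would amount to reading each line back and tagging the subject object with the predicate's tag path and the object's value.</p>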
<p>Just to close, I would mention once again a key element of this picture. <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> opens the door to a truly cooperative, distributed, and online fluid semantic web.  It is one of the first examples of how annotations (a.k.a. metadata) can be easily gathered and used on the &#8220;cloud&#8221; for the masses. Great job guys!</p>
<p>Related posts:<ol>
<li><a href='http://www.xavierllora.net/2009/09/24/liquid-rdf-endpoint-for-fluiddb/' rel='bookmark' title='Liquid: RDF endpoint for FluidDB'>Liquid: RDF endpoint for FluidDB</a></li>
<li><a href='http://www.xavierllora.net/2010/03/15/soaring-the-clouds-with-meandre/' rel='bookmark' title='Soaring the Clouds with Meandre'>Soaring the Clouds with Meandre</a></li>
<li><a href='http://www.xavierllora.net/2010/07/15/meandre-2-0-alpha-preview-scala-mongodb/' rel='bookmark' title='Meandre 2.0 Alpha Preview = Scala + MongoDB'>Meandre 2.0 Alpha Preview = Scala + MongoDB</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.xavierllora.net/2009/08/25/liquid-rdf-meandering-in-fluiddb/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Large Scale Data Mining using Genetics-Based Machine Learning</title>
		<link>http://www.xavierllora.net/2009/07/15/large-scale-data-mining-using-genetics-based-machine-learning/</link>
		<comments>http://www.xavierllora.net/2009/07/15/large-scale-data-mining-using-genetics-based-machine-learning/#comments</comments>
		<pubDate>Wed, 15 Jul 2009 21:56:17 +0000</pubDate>
		<dc:creator>Xavier</dc:creator>
				<category><![CDATA[Data-Intensive Computing]]></category>
		<category><![CDATA[GBML & LCS]]></category>
		<category><![CDATA[Learning Classifier Systems]]></category>
		<category><![CDATA[Presentations]]></category>
		<category><![CDATA[Research]]></category>
		<category><![CDATA[data-intensive flows]]></category>
		<category><![CDATA[genetics-based machine learning]]></category>
		<category><![CDATA[hadoop]]></category>
		<category><![CDATA[LCS]]></category>
		<category><![CDATA[map-reduce]]></category>

		<guid isPermaLink="false">http://www.xavierllora.net/?p=568</guid>
		<description><![CDATA[Below you may find the slides of the GECCO 2009 tutorial that Jaume Bacardit and I put together. Hope you enjoy it. Slides Abstract We are living in the petabyte era. We have larger and larger data to analyze, process and transform into useful answers for the domain experts. Robust data mining tools, able to cope [...]
Related posts:<ol>
<li><a href='http://www.xavierllora.net/2006/12/13/observer-invariant-histopathology-using-genetics-based-machine-learning/' rel='bookmark' title='Observer-Invariant Histopathology using Genetics-Based Machine Learning'>Observer-Invariant Histopathology using Genetics-Based Machine Learning</a></li>
<li><a href='http://www.xavierllora.net/2009/04/07/deadline-extended-for-special-issue-on-metaheuristics-for-large-scale-data-mining/' rel='bookmark' title='Deadline extended for special issue on Metaheuristics for Large Scale Data Mining'>Deadline extended for special issue on Metaheuristics for Large Scale Data Mining</a></li>
<li><a href='http://www.xavierllora.net/2007/04/17/machine-learning-statistical-learning-in-r/' rel='bookmark' title='Machine learning &amp; Statistical Learning in R'>Machine learning &#38; Statistical Learning in R</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>Below you may find the slides of the <a href="http://www.sigevo.org/gecco-2009/tutorials.html#lsdm">GECCO 2009 tutorial</a> that <a href="http://www.cs.nott.ac.uk/~jqb/">Jaume Bacardit</a> and I put together. Hope you enjoy it.</p>
<p><strong>Slides</strong></p>
<iframe src="http://www.slideshare.net/slideshow/embed_code/1727172" width="425" height="356" frameborder="0" marginwidth="0" marginheight="0" scrolling="no"></iframe><br/><br/>
<p><strong>Abstract</strong></p>
<p>We are living in the petabyte era. We have larger and larger data to analyze, process and transform into useful answers for the domain experts. Robust data mining tools, able to cope with petascale volumes and/or high dimensionality while producing human-understandable solutions, are key in several domain areas. Genetics-based machine learning (GBML) techniques are perfect candidates for this task, among others, due to the recent advances in representations, learning paradigms, and theoretical modeling. If evolutionary learning techniques aspire to be a relevant player in this context, they need to have the capacity to process these vast amounts of data, and they need to process this data within reasonable time. Moreover, massive computation cycles are getting cheaper and cheaper every day, giving researchers access to unprecedented degrees of parallelization. Several topics are interlaced in these two requirements, among them: (1) having the proper learning paradigms and knowledge representations, (2) understanding them and knowing when they are suitable for the problem at hand, (3) using efficiency enhancement techniques, and (4) transforming and visualizing the produced solutions to give back as much insight as possible to the domain experts.</p>
<p>This tutorial will try to address these topics, following a roadmap that starts with the questions of what large means and why large is a challenge for GBML methods. Afterwards, we will discuss different facets in which we can overcome this challenge: efficiency enhancement techniques, representations able to cope with high-dimensional spaces, and scalability of learning paradigms. We will also review a topic interlaced with all of them: how we can model the scalability of the components of our GBML systems to better engineer them and get the best performance out of them on large datasets. The roadmap continues with examples of real applications of GBML systems and finishes with an analysis of further directions.</p>
<p>Related posts:<ol>
<li><a href='http://www.xavierllora.net/2006/12/13/observer-invariant-histopathology-using-genetics-based-machine-learning/' rel='bookmark' title='Observer-Invariant Histopathology using Genetics-Based Machine Learning'>Observer-Invariant Histopathology using Genetics-Based Machine Learning</a></li>
<li><a href='http://www.xavierllora.net/2009/04/07/deadline-extended-for-special-issue-on-metaheuristics-for-large-scale-data-mining/' rel='bookmark' title='Deadline extended for special issue on Metaheuristics for Large Scale Data Mining'>Deadline extended for special issue on Metaheuristics for Large Scale Data Mining</a></li>
<li><a href='http://www.xavierllora.net/2007/04/17/machine-learning-statistical-learning-in-r/' rel='bookmark' title='Machine learning &amp; Statistical Learning in R'>Machine learning &#38; Statistical Learning in R</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.xavierllora.net/2009/07/15/large-scale-data-mining-using-genetics-based-machine-learning/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study using  Meandre</title>
		<link>http://www.xavierllora.net/2009/01/29/data-intensive-computing-for-competent-genetic-algorithms-a-pilot-study-using-meandre/</link>
		<comments>http://www.xavierllora.net/2009/01/29/data-intensive-computing-for-competent-genetic-algorithms-a-pilot-study-using-meandre/#comments</comments>
		<pubDate>Thu, 29 Jan 2009 19:36:25 +0000</pubDate>
		<dc:creator>Xavier</dc:creator>
				<category><![CDATA[Publications]]></category>
		<category><![CDATA[Technical Reports]]></category>
		<category><![CDATA[Data-Intensive Computing]]></category>
		<category><![CDATA[eCGA]]></category>
		<category><![CDATA[genetic algorithms]]></category>
		<category><![CDATA[meandre]]></category>
		<category><![CDATA[ZigZag]]></category>

		<guid isPermaLink="false">http://www.xavierllora.net/?p=421</guid>
		<description><![CDATA[by Llorà, X. IlliGAL technical report 2009001. You can download the pdf here. More information is also available at the Meandre website as part of the SEASR project. Abstract: Data-intensive computing has positioned itself as a valuable programming paradigm to efficiently approach problems requiring processing very large volumes of data. This paper presents a pilot study about how to apply the [...]
Related posts:<ol>
<li><a href='http://www.xavierllora.net/2009/07/13/data-intensive-computing-for-competent-genetic-algorithms-a-pilot-study-using-meandre-2/' rel='bookmark' title='Data-Intensive Computing for  Competent Genetic Algorithms:  A Pilot Study using Meandre'>Data-Intensive Computing for  Competent Genetic Algorithms:  A Pilot Study using Meandre</a></li>
<li><a href='http://www.xavierllora.net/2010/04/08/scaling-ecga-model-building-via-data-intensive-computing/' rel='bookmark' title='Scaling eCGA Model Building via Data-Intensive Computing'>Scaling eCGA Model Building via Data-Intensive Computing</a></li>
<li><a href='http://www.xavierllora.net/2008/04/18/meandre-semantic-driven-data-intensive-flow-engine/' rel='bookmark' title='Meandre: Semantic-Driven Data-Intensive Flow Engine'>Meandre: Semantic-Driven Data-Intensive Flow Engine</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<div>
<p><em>by</em> Llorà, X.</p>
<p><a href="http://www.illigal.uiuc.edu/pub/papers/IlliGALs/2009001.pdf">IlliGAL technical report 2009001</a>. You can download the pdf <a href="http://www.illigal.uiuc.edu/pub/papers/IlliGALs/2009001.pdf">here</a>. More information is also available at the <a href="http://seasr.org/meandre">Meandre website</a> as part of the <a href="http://seasr.org/">SEASR project</a>.</p>
<p><strong>Abstract: </strong>Data-intensive computing has positioned itself as a valuable programming paradigm to efficiently approach problems requiring processing very large volumes of data. This paper presents a pilot study about how to apply the data-intensive computing paradigm to evolutionary computation algorithms. Two representative cases&#8212;selectorecombinative genetic algorithms and estimation of distribution algorithms&#8212;are presented, analyzed, and discussed. This study shows that equivalent data-intensive computing evolutionary computation algorithms can be easily developed, providing robust and scalable algorithms for the multicore-computing era. Experimental results show how such algorithms scale with the number of available cores without further modification.</p></div>
<p>Related posts:<ol>
<li><a href='http://www.xavierllora.net/2009/07/13/data-intensive-computing-for-competent-genetic-algorithms-a-pilot-study-using-meandre-2/' rel='bookmark' title='Data-Intensive Computing for  Competent Genetic Algorithms:  A Pilot Study using Meandre'>Data-Intensive Computing for  Competent Genetic Algorithms:  A Pilot Study using Meandre</a></li>
<li><a href='http://www.xavierllora.net/2010/04/08/scaling-ecga-model-building-via-data-intensive-computing/' rel='bookmark' title='Scaling eCGA Model Building via Data-Intensive Computing'>Scaling eCGA Model Building via Data-Intensive Computing</a></li>
<li><a href='http://www.xavierllora.net/2008/04/18/meandre-semantic-driven-data-intensive-flow-engine/' rel='bookmark' title='Meandre: Semantic-Driven Data-Intensive Flow Engine'>Meandre: Semantic-Driven Data-Intensive Flow Engine</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.xavierllora.net/2009/01/29/data-intensive-computing-for-competent-genetic-algorithms-a-pilot-study-using-meandre/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Meandre 1.4.0 released, 1.4.1 coming shortly after</title>
		<link>http://www.xavierllora.net/2009/01/15/meandre-140-released-141-coming-short-after/</link>
		<comments>http://www.xavierllora.net/2009/01/15/meandre-140-released-141-coming-short-after/#comments</comments>
		<pubDate>Thu, 15 Jan 2009 17:22:28 +0000</pubDate>
		<dc:creator>Xavier</dc:creator>
				<category><![CDATA[Software]]></category>
		<category><![CDATA[Data-Intensive Computing]]></category>
		<category><![CDATA[meandre]]></category>

		<guid isPermaLink="false">http://www.xavierllora.net/?p=400</guid>
		<description><![CDATA[Today, Meandre 1.4.0 has been released. The final release includes the infrastructure, tools (Workbench, ZigZag, etc.), and a set of component and flow repositories. Version 1.4.1 will also be released later today, adding the latest feature updates and a few bug fixes to the tools. Related posts: Meandre 1.4.0 final release candidate tagged Is a war [...]
Related posts:<ol>
<li><a href='http://www.xavierllora.net/2009/01/11/meandre-140-final-release-candidate-tagged/' rel='bookmark' title='Meandre 1.4.0 final release candidate tagged'>Meandre 1.4.0 final release candidate tagged</a></li>
<li><a href='http://www.xavierllora.net/2007/01/15/is-a-war-for-easily-sharing-your-stuff-with-your-friends-coming/' rel='bookmark' title='Is a war for easily-sharing-your-stuff-with-your-friends coming?'>Is a war for easily-sharing-your-stuff-with-your-friends coming?</a></li>
<li><a href='http://www.xavierllora.net/2008/12/02/meandre-infrastructure-14-rc1-tagged/' rel='bookmark' title='Meandre Infrastructure 1.4 RC1 tagged'>Meandre Infrastructure 1.4 RC1 tagged</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>Today, <a href="http://seasr.org/meandre/download/">Meandre 1.4.0 has been released</a>. The final release includes the infrastructure, tools (Workbench, ZigZag, etc.), and a set of component and flow repositories. Version 1.4.1 will also be released later today, adding the latest feature updates and a few bug fixes to the tools.</p>
<p>Related posts:<ol>
<li><a href='http://www.xavierllora.net/2009/01/11/meandre-140-final-release-candidate-tagged/' rel='bookmark' title='Meandre 1.4.0 final release candidate tagged'>Meandre 1.4.0 final release candidate tagged</a></li>
<li><a href='http://www.xavierllora.net/2007/01/15/is-a-war-for-easily-sharing-your-stuff-with-your-friends-coming/' rel='bookmark' title='Is a war for easily-sharing-your-stuff-with-your-friends coming?'>Is a war for easily-sharing-your-stuff-with-your-friends coming?</a></li>
<li><a href='http://www.xavierllora.net/2008/12/02/meandre-infrastructure-14-rc1-tagged/' rel='bookmark' title='Meandre Infrastructure 1.4 RC1 tagged'>Meandre Infrastructure 1.4 RC1 tagged</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.xavierllora.net/2009/01/15/meandre-140-released-141-coming-short-after/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

