<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Xavier Llorà &#187; Research</title>
	<atom:link href="http://www.xavierllora.net/category/research/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.xavierllora.net</link>
	<description>A notebook about data-intensive computing, genetics-based machine learning, semantic-web technology, cloud computing, and more.</description>
	<lastBuildDate>Thu, 15 Jul 2010 19:50:19 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>Meandre 2.0 Alpha Preview = Scala + MongoDB</title>
		<link>http://www.xavierllora.net/2010/07/15/meandre-2-0-alpha-preview-scala-mongodb/</link>
		<comments>http://www.xavierllora.net/2010/07/15/meandre-2-0-alpha-preview-scala-mongodb/#comments</comments>
		<pubDate>Thu, 15 Jul 2010 19:45:00 +0000</pubDate>
		<dc:creator>Xavier</dc:creator>
				<category><![CDATA[Data-Intensive Computing]]></category>
		<category><![CDATA[Meandre]]></category>
		<category><![CDATA[Presentations]]></category>
		<category><![CDATA[Research]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[Crochet]]></category>
		<category><![CDATA[Derby]]></category>
		<category><![CDATA[JENA]]></category>
		<category><![CDATA[meandre]]></category>
		<category><![CDATA[mongodb]]></category>
		<category><![CDATA[MySQL]]></category>
		<category><![CDATA[RDF]]></category>
		<category><![CDATA[scala]]></category>
		<category><![CDATA[Snare]]></category>

		<guid isPermaLink="false">http://www.xavierllora.net/?p=697</guid>
		<description><![CDATA[A lot of water has gone under the bridge since the first release of the Meandre 1.4.X series. In January I went back to the drawing board and started sketching what was going to be the 1.5.X series. The slide deck embedded above is an extended list of my thoughts during the process. As usual, I [...]


Related posts:<ol><li><a href='http://www.xavierllora.net/2010/01/21/fast-rest-api-prototyping-with-crochet-and-scala/' rel='bookmark' title='Permanent Link: Fast REST API prototyping with Crochet and Scala'>Fast REST API prototyping with Crochet and Scala</a></li>
<li><a href='http://www.xavierllora.net/2009/12/01/meandre-is-going-scala/' rel='bookmark' title='Permanent Link: Meandre is going Scala'>Meandre is going Scala</a></li>
<li><a href='http://www.xavierllora.net/2008/12/02/meandre-infrastructure-14-rc1-tagged/' rel='bookmark' title='Permanent Link: Meandre Infrastructure 1.4 RC1 tagged'>Meandre Infrastructure 1.4 RC1 tagged</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p><center><br />
<object width="425" height="348"><param name="movie" value="http://static.slideshare.net/swf/ssplayer2.swf?doc=meandre2-0alpha-preview-100715140140-phpapp01"/><param name="allowFullScreen" value="true"/><param name="allowScriptAccess" value="always"/><embed src="http://static.slideshare.net/swf/ssplayer2.swf?doc=meandre2-0alpha-preview-100715140140-phpapp01"  type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="348"></embed></object><br />
</center></p>
<p>A lot of water has gone under the bridge since the first release of the <a href="http://seasr.org/meandre/download/">Meandre 1.4.X series</a>. In January I went back to the drawing board and started sketching what was going to be the 1.5.X series. The slide deck embedded above is an extended list of my thoughts during the process. As usual, I started by collecting feedback from people using 1.4.X in production: things that worked, things that needed improvement, and things that were just plain overcomplicated. The hot recurring topics raised by people using 1.4.X can be summarized as:</p>
<ul>
<li>Complex execution concurrency model based on traditional semaphores written in Java (mostly my maintenance nightmare when changes need to be introduced)</li>
<li>Server performance bounded by <a href="http://jena.sourceforge.net/">JENA</a>&#8217;s persistent model implementation</li>
<li>State caching on individual servers to boost performance increases complexity of single-image cluster deployments</li>
<li>Cloud-deployable infrastructure, but not cloud-friendly infrastructure</li>
</ul>
<p>As I mentioned, these elements were the main targets for the 1.5.X series. However, as the redesign moved forward, the new version became a radical departure from the 1.4.X series and eventually turned out to be the 2.0 Alpha version described here. The main changes that forced this transition are:</p>
<ul>
<li>Cloud-friendly infrastructure required rethinking of the core functionalities</li>
<li>Drastic redesign of the back-end state storage</li>
<li>Revisited flow execution engine to support flow execution</li>
<li>Changes to the API that render returned JSON documents incompatible with 1.4.X</li>
</ul>
<p>Meandre 2.0 (already available in the <a href="http://dev-tools.seasr.org/fisheye/browse/Meandre-Infrastructure">SVN trunk</a>) has been rewritten from scratch in <a href="http://www.scala-lang.org/">Scala</a>. That decision was motivated by the Actor model provided by <a href="http://www.scala-lang.org/">Scala</a> (modeled after <a href="http://www.erlang.org/">Erlang</a>&#8217;s actors). Such a model greatly simplifies the mechanics of the infrastructure, and it also provided the basis of Snowfield (the effort to create a scalable distributed flow execution engine for Meandre flows). Also, the expressiveness of the <a href="http://www.scala-lang.org/">Scala</a> language has greatly reduced the size of the code base (the 2.0 code base is roughly 1/3 of the size of the 1.4.X series), greatly simplifying the maintenance the infrastructure will require as we move forward.</p>
<p>The second big change that pulled the 2.0 Alpha trigger was the redesign of the back-end state storage. The 1.4.X series relied heavily on the relational storage for persistent RDF models provided by JENA. For performance reasons, <a href="http://jena.sourceforge.net/">JENA</a> caches the model in memory and mostly assumes ownership of the model. Hence, if you want to provide a single-image Meandre cluster, you need to inject cache coherence mechanics into <a href="http://jena.sourceforge.net/">JENA</a>, greatly increasing the complexity. Also, the relational implementation maps each model into a table and each triple into a row (this is a bit of a simplification). That implies that a large number of SQL statements need to be generated to update models, heavily taxing the relational storage when changes to user repository data need to be introduced.</p>
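<p>To make that cost concrete, here is a minimal Python sketch of the simplified mapping just described (one table per model, one row per triple), using an in-memory SQLite database. It is not JENA's actual schema or API; the function name <code>store_model</code>, the column names, and the sample triples are assumptions made for illustration only.</p>
<pre>
```python
import sqlite3


def store_model(conn, table, triples):
    """Rewrite a whole model: one table per model, one row per triple.

    Returns the number of SQL statements issued, to show how the count
    grows with the size of the model. Hypothetical helper, not JENA code.
    """
    # Table names cannot be bound as SQL parameters, so validate the name.
    if not table.isidentifier():
        raise ValueError("unsafe table name")
    conn.execute(f"DROP TABLE IF EXISTS {table}")
    conn.execute(f"CREATE TABLE {table} (s TEXT, p TEXT, o TEXT)")
    count = 2
    for triple in triples:
        # One INSERT per triple: this is the part that taxes the database.
        conn.execute(f"INSERT INTO {table} VALUES (?, ?, ?)", triple)
        count += 1
    conn.commit()
    return count


# Illustrative data only; these identifiers are made up for the example.
sample = [
    ("flow:demo", "rdf:type", "meandre:Flow"),
    ("flow:demo", "dc:creator", "xavier"),
    ("flow:demo", "dc:title", "Demo flow"),
]
connection = sqlite3.connect(":memory:")
print(store_model(connection, "user_repository", sample))
```
</pre>
<p>Even in this toy version, rewriting a three-triple model takes five statements, and the count grows linearly with the number of triples in the model.</p>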
<p>An ideal cloud-friendly Meandre infrastructure should not maintain state (neither voluntarily nor as a result of the <a href="http://jena.sourceforge.net/">JENA</a> back end). Thus, a fast and scalable back-end storage would allow infrastructure servers to maintain no state and still provide the appearance of a single-image cluster. After testing different alternatives and weighing their community support and development roadmaps, the only option left was <a href="http://www.mongodb.org/">MongoDB</a>. Its setup simplicity for small installations and its ability to easily scale to large installations (including cloud-deployed ones) made <a href="http://www.mongodb.org/">MongoDB</a> the candidate to maintain state for Meandre 2.0. This was quite a departure from the 1.4.X series, where you had the choice to store state via <a href="http://jena.sourceforge.net/">JENA</a> on an embedded <a href="http://db.apache.org/derby/">Derby</a> or an external <a href="http://www.mysql.com/">MySQL</a> server.</p>
<p>A final note on the building blocks that made the 2.0 series possible. Two other side projects were started to support the development of what will become the Meandre 2.0.X series:</p>
<ol>
<li><a href="http://github.com/xllora/Crochet">Crochet</a>: <a href="http://github.com/xllora/Crochet">Crochet</a> aims to help quickly prototype REST APIs, relying on the flexibility of the Scala language. The initial ideas for Crochet were inspired by Gabriele Renzi&#8217;s post on creating a picoframework with Scala (see <a href="http://www.riffraff.info/2009/4/11/step-a-scala-web-picoframework">http://www.riffraff.info/2009/4/11/step-a-scala-web-picoframework</a>) and by the need to quickly prototype APIs for pilot projects. Crochet also provides mechanisms to hide repetitive tasks involved with default responses and authentication/authorization, piggybacking on the mechanics provided by application servers.</li>
<li><a href="http://github.com/xllora/Snare">Snare</a>: <a href="http://github.com/xllora/Snare">Snare</a> is a coordination layer for distributed applications written in Scala that relies on <a href="http://www.mongodb.org/">MongoDB</a> to implement its communication layer. <a href="http://github.com/xllora/Snare">Snare</a> implements a basic heartbeat system and a simple notification mechanism (peer-to-peer and broadcast communication), and it uses <a href="http://www.mongodb.org/">MongoDB</a> to track heartbeats and notification mailboxes.</li>
</ol>
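<p>The heartbeat and mailbox idea behind a coordination layer like Snare can be sketched in a few lines of Python. This is only an illustration under loud assumptions: plain dictionaries stand in for the MongoDB collections, and the class and method names (<code>InMemoryCoordinator</code>, <code>heartbeat</code>, <code>notify</code>, <code>broadcast</code>, <code>drain</code>) are hypothetical, not Snare's actual Scala API.</p>
<pre>
```python
import time


class InMemoryCoordinator:
    """Toy coordination layer; dictionaries stand in for a shared store."""

    def __init__(self, timeout=5.0):
        self.timeout = timeout
        self.heartbeats = {}  # peer id -> timestamp of the last heartbeat
        self.mailboxes = {}   # peer id -> list of (sender, payload) pairs

    def heartbeat(self, peer, now=None):
        """Record that a peer is alive and make sure it has a mailbox."""
        self.heartbeats[peer] = time.time() if now is None else now
        self.mailboxes.setdefault(peer, [])

    def alive(self, now=None):
        """Peers whose last heartbeat is no older than the timeout."""
        now = time.time() if now is None else now
        return sorted(
            peer
            for peer, seen in self.heartbeats.items()
            if self.timeout >= now - seen
        )

    def notify(self, sender, recipient, payload):
        """Peer-to-peer notification: append to a single mailbox."""
        self.mailboxes.setdefault(recipient, []).append((sender, payload))

    def broadcast(self, sender, payload, now=None):
        """Broadcast: notify every live peer except the sender."""
        for peer in self.alive(now):
            if peer != sender:
                self.notify(sender, peer, payload)

    def drain(self, peer):
        """Return and clear the pending notifications for a peer."""
        messages = self.mailboxes.get(peer, [])
        self.mailboxes[peer] = []
        return messages
```
</pre>
<p>Because all state lives in the shared store rather than in the peers, any server can answer who is alive or deliver a notification, which is the property a stateless, single-image cluster needs.</p>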


<p>Related posts:<ol><li><a href='http://www.xavierllora.net/2010/01/21/fast-rest-api-prototyping-with-crochet-and-scala/' rel='bookmark' title='Permanent Link: Fast REST API prototyping with Crochet and Scala'>Fast REST API prototyping with Crochet and Scala</a></li>
<li><a href='http://www.xavierllora.net/2009/12/01/meandre-is-going-scala/' rel='bookmark' title='Permanent Link: Meandre is going Scala'>Meandre is going Scala</a></li>
<li><a href='http://www.xavierllora.net/2008/12/02/meandre-infrastructure-14-rc1-tagged/' rel='bookmark' title='Permanent Link: Meandre Infrastructure 1.4 RC1 tagged'>Meandre Infrastructure 1.4 RC1 tagged</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.xavierllora.net/2010/07/15/meandre-2-0-alpha-preview-scala-mongodb/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>IWLCS 2010 &#8211; Discussion session on LCS / XCS(F)</title>
		<link>http://www.xavierllora.net/2010/06/21/iwlcs-2010-discussion-session-on-lcs-xcsf/</link>
		<comments>http://www.xavierllora.net/2010/06/21/iwlcs-2010-discussion-session-on-lcs-xcsf/#comments</comments>
		<pubDate>Mon, 21 Jun 2010 22:04:08 +0000</pubDate>
		<dc:creator>Xavier</dc:creator>
				<category><![CDATA[Learning Classifier Systems]]></category>
		<category><![CDATA[Research]]></category>
		<category><![CDATA[GBML]]></category>
		<category><![CDATA[GBML & LCS]]></category>
		<category><![CDATA[genetics-based machine learning]]></category>
		<category><![CDATA[LCS]]></category>
		<category><![CDATA[XCS]]></category>

		<guid isPermaLink="false">http://www.xavierllora.net/?p=690</guid>
		<description><![CDATA[I just got an email from Martin Butz about a discussion session being planned for IWLCS 2010 and his request to pass it along. Hope all is well and you are going to attend GECCO this year. Regardless if you attend or not: Jaume asked me to lead a discussion session on “LCS representations, operators, [...]


Related posts:<ol><li><a href='http://www.xavierllora.net/2006/05/04/list-of-papers-to-be-presented-at-iwlcs-2006/' rel='bookmark' title='Permanent Link: List of papers to be presented at IWLCS 2006'>List of papers to be presented at IWLCS 2006</a></li>
<li><a href='http://www.xavierllora.net/2008/12/12/join-me-congratulating-albert-orriols/' rel='bookmark' title='Permanent Link: Join me congratulating Albert Orriols, Ph.D.'>Join me congratulating Albert Orriols, Ph.D.</a></li>
<li><a href='http://www.xavierllora.net/2008/07/13/ilwcs-2008-live/' rel='bookmark' title='Permanent Link: ILWCS 2008 live'>ILWCS 2008 live</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>I just got an email from <a href="http://www.coboslab.psychologie.uni-wuerzburg.de/people/martin_v_butz/">Martin Butz</a> about a discussion session being planned for IWLCS 2010 and his request to pass it along.</p>
<blockquote><p>Hope all is well and you are going to attend GECCO this year.</p>
<p>Regardless if you attend or not:</p>
<p>Jaume asked me to lead a discussion session on</p>
<p>“LCS representations, operators, and scalability – what is next?”</p>
<p>… or similar during IWLCS… Basically everything besides datamining, because there will be another session on that topic.</p>
<p>So, I am sure you all have some issues in mind that you think should be tackled / addressed / discussed at the workshop and in the near future.</p>
<p>Thus, I would be very happy to receive a few suggestions from your side – anything is welcome – I will then compile the points raised in a few slides to try and get the discussion going at the workshop.</p>
<p>Thank you for any feedback you can provide.</p>
<p>Looking forward to seeing you soon!</p>
<p>Martin</p>
<p>P.S.: Please feel free to also forward this message or tell me, if you think this Email should be still sent to other people…<br />
&#8212;-</p>
<p>PD Dr. Martin V. Butz &lt;<a href="mailto:butz@psychologie.uni-wuerzburg.de" target="_blank">butz@psychologie.uni-wuerzburg.de</a>&gt;</p>
<p>Department of Psychology III (Cognitive Psychology)<br />
Roentgenring 11<br />
97070 Wuerzburg, Germany<br />
<a href="http://www.coboslab.psychologie.uni-wuerzburg.de/people/martin_v_butz/" target="_blank">http://www.coboslab.psychologie.uni-wuerzburg.de/people/martin_v_butz/<br />
</a><a href="http://www.coboslab" target="_blank">http://www.coboslab</a>.<a href="http://psychologie.uni-wuerzburg.de" target="_blank">psychologie.uni-wuerzburg.de<br />
</a>Phone: +49 (0)931 31 82808<br />
Fax:    +49 (0)931 31 82815</p></blockquote>


<p>Related posts:<ol><li><a href='http://www.xavierllora.net/2006/05/04/list-of-papers-to-be-presented-at-iwlcs-2006/' rel='bookmark' title='Permanent Link: List of papers to be presented at IWLCS 2006'>List of papers to be presented at IWLCS 2006</a></li>
<li><a href='http://www.xavierllora.net/2008/12/12/join-me-congratulating-albert-orriols/' rel='bookmark' title='Permanent Link: Join me congratulating Albert Orriols, Ph.D.'>Join me congratulating Albert Orriols, Ph.D.</a></li>
<li><a href='http://www.xavierllora.net/2008/07/13/ilwcs-2008-live/' rel='bookmark' title='Permanent Link: ILWCS 2008 live'>ILWCS 2008 live</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.xavierllora.net/2010/06/21/iwlcs-2010-discussion-session-on-lcs-xcsf/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>GAssist and GALE Now Available in Python</title>
		<link>http://www.xavierllora.net/2010/06/11/gassist-and-gale-now-available-in-python/</link>
		<comments>http://www.xavierllora.net/2010/06/11/gassist-and-gale-now-available-in-python/#comments</comments>
		<pubDate>Fri, 11 Jun 2010 17:06:35 +0000</pubDate>
		<dc:creator>Xavier</dc:creator>
				<category><![CDATA[Learning Classifier Systems]]></category>
		<category><![CDATA[Research]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[GALE]]></category>
		<category><![CDATA[GAssist]]></category>
		<category><![CDATA[GBML]]></category>
		<category><![CDATA[GBML & LCS]]></category>
		<category><![CDATA[LCS]]></category>
		<category><![CDATA[MCS]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[UCS]]></category>
		<category><![CDATA[XCS]]></category>

		<guid isPermaLink="false">http://www.xavierllora.net/?p=682</guid>
		<description><![CDATA[Ryan Urbanowicz has released Python versions of GAssist and GALE!!! Yup, so excited to see a new incarnation of GALE doing the rounds. I cannot wait to get my hands on it. Ryan has also done an excellent job porting UCS, XCS, and MCS to Python and making those implementations available via &#8220;LCS &#38; GBML central&#8221; for [...]


Related posts:<ol><li><a href='http://www.xavierllora.net/2006/05/10/gale-is-back/' rel='bookmark' title='Permanent Link: GALE is back!'>GALE is back!</a></li>
<li><a href='http://www.xavierllora.net/2008/11/13/fast-mutation-implementation-for-genetic-algorithms-in-python/' rel='bookmark' title='Permanent Link: Fast mutation implementation for genetic algorithms in Python'>Fast mutation implementation for genetic algorithms in Python</a></li>
<li><a href='http://www.xavierllora.net/2009/05/04/transcoding-nigel-2006-videos/' rel='bookmark' title='Permanent Link: Transcoding NIGEL 2006 videos'>Transcoding NIGEL 2006 videos</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.linkedin.com/pub/ryan-urbanowicz/15/3ba/82b">Ryan Urbanowicz</a> has released Python versions of GAssist and GALE!!! Yup, so excited to see a new incarnation of GALE doing the rounds. I cannot wait to get my hands on it. Ryan has also done an excellent job porting UCS, XCS, and MCS to Python and making those implementations available via <a href="http://gbml.org">&#8220;LCS &amp; GBML central&#8221;</a> for people to use. I think Ryan&#8217;s efforts deserve recognition. His code is helping others find an easier entry into LCS and GBML.</p>
<p>More information about Ryan&#8217;s implementations can be found below:</p>
<ul>
<li><a href="http://gbml.org/2010/06/10/python-lcs-implementations-gale-gassist-for-snp-environment/">GAssist and GALE</a></li>
<li><a href="http://gbml.org/2010/03/24/python-lcs-implementations-xcs-ucs-mcs-for-snp-environment/">UCS, XCS and MCS</a></li>
</ul>
<p><em>Side note: my original GALE implementation can also be downloaded </em><a href="http://www.xavierllora.net/2006/05/10/gale-is-back/"><em>here</em></a><em>.</em></p>


<p>Related posts:<ol><li><a href='http://www.xavierllora.net/2006/05/10/gale-is-back/' rel='bookmark' title='Permanent Link: GALE is back!'>GALE is back!</a></li>
<li><a href='http://www.xavierllora.net/2008/11/13/fast-mutation-implementation-for-genetic-algorithms-in-python/' rel='bookmark' title='Permanent Link: Fast mutation implementation for genetic algorithms in Python'>Fast mutation implementation for genetic algorithms in Python</a></li>
<li><a href='http://www.xavierllora.net/2009/05/04/transcoding-nigel-2006-videos/' rel='bookmark' title='Permanent Link: Transcoding NIGEL 2006 videos'>Transcoding NIGEL 2006 videos</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.xavierllora.net/2010/06/11/gassist-and-gale-now-available-in-python/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>LCS &amp; GBML Central Gets a New Home</title>
		<link>http://www.xavierllora.net/2010/06/03/lcs-gbml-central-get-a-new-home/</link>
		<comments>http://www.xavierllora.net/2010/06/03/lcs-gbml-central-get-a-new-home/#comments</comments>
		<pubDate>Fri, 04 Jun 2010 04:32:38 +0000</pubDate>
		<dc:creator>Xavier</dc:creator>
				<category><![CDATA[Learning Classifier Systems]]></category>
		<category><![CDATA[Research]]></category>
		<category><![CDATA[GBML]]></category>
		<category><![CDATA[GBML & LCS]]></category>
		<category><![CDATA[LCS]]></category>

		<guid isPermaLink="false">http://www.xavierllora.net/?p=671</guid>
		<description><![CDATA[Today I finished migrating the LCS &#38; GBML Central site from its original URL (http://lcs-gbml.ncsa.uiuc.edu) to a more permanent and stable home located at http://gbml.org. The original site is already redirecting traffic to the new site, and it will be doing so for a while to help people transition and update bookmarks and [...]


Related posts:<ol><li><a href='http://www.xavierllora.net/2009/05/13/lcs-gbml-central-back-to-production/' rel='bookmark' title='Permanent Link: LCS &#038; GBML Central back to production'>LCS &#038; GBML Central back to production</a></li>
<li><a href='http://www.xavierllora.net/2009/03/27/lcsweb-gbml-blog-lcs-gbml-central/' rel='bookmark' title='Permanent Link: LCSweb + GBML blog = LCS &amp; GBML Central'>LCSweb + GBML blog = LCS &amp; GBML Central</a></li>
<li><a href='http://www.xavierllora.net/2006/02/12/new-books-section-on-the-lcs-and-gbml-web/' rel='bookmark' title='Permanent Link: New books section on the LCS and GBML web'>New books section on the LCS and GBML web</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>Today I finished migrating the <em>LCS &amp; GBML Central</em> site from its original URL (<a href="http://lcs-gbml.ncsa.uiuc.edu">http://lcs-gbml.ncsa.uiuc.edu</a>) to a more permanent and stable home located at <a href="http://gbml.org">http://gbml.org</a>. The original site is already redirecting traffic to the new site, and it will be doing so for a while to help people transition and update bookmarks and feed readers.</p>
<p>I have introduced a few changes to the functionality of the original site. They can be mostly summarized as (1) dropping the forums section and (2) closing comments on posts and pages. Both functionalities, rarely used in their current form, have been replaced by a simpler public embedded <a href="https://wave.google.com/">Wave</a> reachable at <a href="http://gbml.org/wave">http://gbml.org/wave</a>. The goal is to provide people in the LCS &amp; GBML community with a simpler way to discuss, share, and hang out.</p>
<p>As for the feeds being aggregated, I have revised the list and added the table-of-contents feeds now available from:</p>
<ul>
<li><a href="http://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=4235">IEEE Transactions on Evolutionary Computation</a></li>
<li><a href="http://www.springerlink.com/content/101181/">Springer&#8217;s Soft Computing</a></li>
<li><a href="http://www.springerlink.com/content/108905/">Springer&#8217;s Natural Computing</a></li>
</ul>
<p>I have also added a few other links to relevant research groups working in related areas. Please leave a comment on this post if you know of or run a related site that could be aggregated, or if there are missing links to research groups or useful resources.</p>


<p>Related posts:<ol><li><a href='http://www.xavierllora.net/2009/05/13/lcs-gbml-central-back-to-production/' rel='bookmark' title='Permanent Link: LCS &#038; GBML Central back to production'>LCS &#038; GBML Central back to production</a></li>
<li><a href='http://www.xavierllora.net/2009/03/27/lcsweb-gbml-blog-lcs-gbml-central/' rel='bookmark' title='Permanent Link: LCSweb + GBML blog = LCS &amp; GBML Central'>LCSweb + GBML blog = LCS &amp; GBML Central</a></li>
<li><a href='http://www.xavierllora.net/2006/02/12/new-books-section-on-the-lcs-and-gbml-web/' rel='bookmark' title='Permanent Link: New books section on the LCS and GBML web'>New books section on the LCS and GBML web</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.xavierllora.net/2010/06/03/lcs-gbml-central-get-a-new-home/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Scaling eCGA Model Building via Data-Intensive Computing</title>
		<link>http://www.xavierllora.net/2010/04/08/scaling-ecga-model-building-via-data-intensive-computing/</link>
		<comments>http://www.xavierllora.net/2010/04/08/scaling-ecga-model-building-via-data-intensive-computing/#comments</comments>
		<pubDate>Thu, 08 Apr 2010 16:17:39 +0000</pubDate>
		<dc:creator>Xavier</dc:creator>
				<category><![CDATA[Data-Intensive Computing]]></category>
		<category><![CDATA[Estimation of Distribution Algorithms]]></category>
		<category><![CDATA[Research]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[eCGA]]></category>
		<category><![CDATA[hadoop]]></category>
		<category><![CDATA[map-reduce]]></category>
		<category><![CDATA[mongodb]]></category>
		<category><![CDATA[pro]]></category>

		<guid isPermaLink="false">http://www.xavierllora.net/?p=664</guid>
		<description><![CDATA[I just uploaded the technical report of the paper we put together for CEC 2010 on how we can scale up eCGA using a MapReduce approach. Besides exploring the Hadoop implementation, the paper also presents some very compelling results obtained with MongoDB (a document-based store able to perform parallel MapReduce tasks via sharding). [...]


Related posts:<ol><li><a href='http://www.xavierllora.net/2009/10/09/scaling-genetic-algorithms-using-mapreduce/' rel='bookmark' title='Permanent Link: Scaling Genetic Algorithms using MapReduce'>Scaling Genetic Algorithms using MapReduce</a></li>
<li><a href='http://www.xavierllora.net/2009/01/29/data-intensive-computing-for-competent-genetic-algorithms-a-pilot-study-using-meandre/' rel='bookmark' title='Permanent Link: Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study using  Meandre'>Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study using  Meandre</a></li>
<li><a href='http://www.xavierllora.net/2009/07/13/data-intensive-computing-for-competent-genetic-algorithms-a-pilot-study-using-meandre-2/' rel='bookmark' title='Permanent Link: Data-Intensive Computing for  Competent Genetic Algorithms:  A Pilot Study using Meandre'>Data-Intensive Computing for  Competent Genetic Algorithms:  A Pilot Study using Meandre</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>I just uploaded the technical report of the paper we put together for <a href="http://www.wcci2010.org/">CEC 2010</a> on how we can scale up eCGA using a MapReduce approach. Besides exploring the <a href="http://hadoop.apache.org/">Hadoop</a> implementation, the paper also presents some very compelling results obtained with <a href="http://www.mongodb.org/display/DOCS/Home">MongoDB</a> (a document-based store able to perform parallel MapReduce tasks via sharding). The paper is available as <a href="http://www.illigal.uiuc.edu/pub/papers/IlliGALs/2010001.pdf">PDF</a> and <a href="http://www.illigal.uiuc.edu/pub/papers/IlliGALs/2010001.ps.Z">PS</a>.</p>
<p><strong>Abstract:</strong><br />
This paper shows how the extended compact genetic algorithm can be scaled using data-intensive computing techniques such as MapReduce. Two different frameworks (Hadoop and MongoDB) are used to deploy MapReduce implementations of the compact and extended compact genetic algorithms. Results show that both are good choices to deal with large-scale problems as they can scale with the number of commodity machines, as opposed to previous efforts with other techniques that either required specialized high-performance hardware or shared memory environments.</p>
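<p>One natural way to phrase the counting behind probabilistic model building as MapReduce is sketched below in Python. This is an illustrative toy that runs in a single process on an in-memory population of bit strings; it is not the Hadoop or MongoDB code from the paper, and the function names (<code>map_individual</code>, <code>reduce_counts</code>, <code>gene_frequencies</code>) are hypothetical.</p>
<pre>
```python
from collections import defaultdict


def map_individual(individual):
    """Map step: emit one (gene_index, bit) pair per gene."""
    for index, bit in enumerate(individual):
        yield index, bit


def reduce_counts(pairs):
    """Reduce step: sum the bits emitted for each gene index."""
    totals = defaultdict(int)
    for index, bit in pairs:
        totals[index] += bit
    return dict(totals)


def gene_frequencies(population):
    """Fraction of individuals carrying a 1 at each gene position."""
    pairs = (pair for ind in population for pair in map_individual(ind))
    ones = reduce_counts(pairs)
    size = len(population)
    return [ones.get(i, 0) / size for i in range(len(population[0]))]


# Illustrative population of four 3-bit individuals.
population = [[1, 0, 1], [1, 1, 0], [0, 0, 1], [1, 0, 1]]
print(gene_frequencies(population))
```
</pre>
<p>Since each individual is mapped independently and the reduce step only adds counts, the population can be split across commodity machines without changing the result.</p>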


<p>Related posts:<ol><li><a href='http://www.xavierllora.net/2009/10/09/scaling-genetic-algorithms-using-mapreduce/' rel='bookmark' title='Permanent Link: Scaling Genetic Algorithms using MapReduce'>Scaling Genetic Algorithms using MapReduce</a></li>
<li><a href='http://www.xavierllora.net/2009/01/29/data-intensive-computing-for-competent-genetic-algorithms-a-pilot-study-using-meandre/' rel='bookmark' title='Permanent Link: Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study using  Meandre'>Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study using  Meandre</a></li>
<li><a href='http://www.xavierllora.net/2009/07/13/data-intensive-computing-for-competent-genetic-algorithms-a-pilot-study-using-meandre-2/' rel='bookmark' title='Permanent Link: Data-Intensive Computing for  Competent Genetic Algorithms:  A Pilot Study using Meandre'>Data-Intensive Computing for  Competent Genetic Algorithms:  A Pilot Study using Meandre</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.xavierllora.net/2010/04/08/scaling-ecga-model-building-via-data-intensive-computing/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Soaring the Clouds with Meandre</title>
		<link>http://www.xavierllora.net/2010/03/15/soaring-the-clouds-with-meandre/</link>
		<comments>http://www.xavierllora.net/2010/03/15/soaring-the-clouds-with-meandre/#comments</comments>
		<pubDate>Mon, 15 Mar 2010 22:55:11 +0000</pubDate>
		<dc:creator>Xavier</dc:creator>
				<category><![CDATA[Data-Intensive Computing]]></category>
		<category><![CDATA[Notes]]></category>
		<category><![CDATA[Research]]></category>
		<category><![CDATA[cloud computing]]></category>
		<category><![CDATA[hadoop]]></category>
		<category><![CDATA[meandre]]></category>
		<category><![CDATA[ZigZag]]></category>

		<guid isPermaLink="false">http://www.xavierllora.net/?p=659</guid>
		<description><![CDATA[You may find the slide deck and the abstract for the presentation we delivered today at the &#8220;Data-Intensive Research: how should we improve our ability to use data&#8221; workshop in Edinburgh. Abstract This talk will focus on a highly scalable data intensive infrastructure being developed at the National Center for Supercomputing Applications (NCSA) at the University [...]


Related posts:<ol><li><a href='http://www.xavierllora.net/2008/11/15/meandre-semantic-driven-data-intensive-flows-in-the-clouds/' rel='bookmark' title='Permanent Link: Meandre: Semantic-Driven Data-Intensive Flows in the Clouds'>Meandre: Semantic-Driven Data-Intensive Flows in the Clouds</a></li>
<li><a href='http://www.xavierllora.net/2009/01/29/data-intensive-computing-for-competent-genetic-algorithms-a-pilot-study-using-meandre/' rel='bookmark' title='Permanent Link: Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study using  Meandre'>Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study using  Meandre</a></li>
<li><a href='http://www.xavierllora.net/2008/03/26/bdcsg2008-clouds-and-manycores-the-revolution-dan-reed/' rel='bookmark' title='Permanent Link: [BDCSG2008] Clouds and ManyCores: The Revolution (Dan Reed)'>[BDCSG2008] Clouds and ManyCores: The Revolution (Dan Reed)</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>You may find the slide deck and the abstract for the presentation we delivered today at the <a href="http://wikis.nesc.ac.uk/escienvoy/Data-Intensive_Research:_how_should_we_improve_our_ability_to_use_data">&#8220;Data-Intensive Research: how should we improve our ability to use data&#8221;</a> workshop in Edinburgh.</p>
<p><center><object width="425" height="348"><param name="movie" value="http://static.slideshare.net/swf/ssplayer2.swf?doc=dir-workshop-20100315-100315174055-phpapp01"/><param name="allowFullScreen" value="true"/><param name="allowScriptAccess" value="always"/><embed src="http://static.slideshare.net/swf/ssplayer2.swf?doc=dir-workshop-20100315-100315174055-phpapp01"  type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="348"></embed></object></center></p>
<p><strong>Abstract</strong></p>
<p>This talk will focus on a highly scalable data intensive infrastructure being developed at the National Center for Supercomputing Applications (NCSA) at the University of Illinois and will introduce current research efforts to tackle the challenges presented by big data. Research efforts include exploring potential ways of integrating cloud computing concepts (such as Hadoop or Meandre) with traditional HPC technologies and assets. These architecture models contrast significantly, but can be leveraged by building cloud conduits that connect these resources to provide even greater flexibility and scalability on demand. Orchestrating the physical computational environment requires innovative and sophisticated software infrastructure that can transparently take advantage of the functional features and negotiate the constraints imposed by this diversity of computational resources. Research conducted during the development of the Meandre infrastructure has led to the production of an agile conductor able to leverage the particular advantages in the physical diversity. It can also be implemented as services and/or in the context of another application benefiting from its reusability, flexibility, and high scalability. Some example applications and an introduction to the data intensive infrastructure architecture will be presented to provide an overview of the diverse scope of Meandre usages. Finally, a case will be presented showing how software developers and system designers can easily transition to these new paradigms to address the primary data-deluge challenges and to soar to new heights with extreme application scalability using cloud computing concepts.</p>


<p>Related posts:<ol><li><a href='http://www.xavierllora.net/2008/11/15/meandre-semantic-driven-data-intensive-flows-in-the-clouds/' rel='bookmark' title='Permanent Link: Meandre: Semantic-Driven Data-Intensive Flows in the Clouds'>Meandre: Semantic-Driven Data-Intensive Flows in the Clouds</a></li>
<li><a href='http://www.xavierllora.net/2009/01/29/data-intensive-computing-for-competent-genetic-algorithms-a-pilot-study-using-meandre/' rel='bookmark' title='Permanent Link: Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study using  Meandre'>Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study using  Meandre</a></li>
<li><a href='http://www.xavierllora.net/2008/03/26/bdcsg2008-clouds-and-manycores-the-revolution-dan-reed/' rel='bookmark' title='Permanent Link: [BDCSG2008] Clouds and ManyCores: The Revolution (Dan Reed)'>[BDCSG2008] Clouds and ManyCores: The Revolution (Dan Reed)</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.xavierllora.net/2010/03/15/soaring-the-clouds-with-meandre/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>GECCO 2010 Submission Deadline (Extended)</title>
		<link>http://www.xavierllora.net/2009/12/19/gecco-2010-submission-deadline/</link>
		<comments>http://www.xavierllora.net/2009/12/19/gecco-2010-submission-deadline/#comments</comments>
		<pubDate>Sat, 19 Dec 2009 12:11:34 +0000</pubDate>
		<dc:creator>Xavier</dc:creator>
				<category><![CDATA[Estimation of Distribution Algorithms]]></category>
		<category><![CDATA[Human-Computer Interaction]]></category>
		<category><![CDATA[Learning Classifier Systems]]></category>
		<category><![CDATA[Research]]></category>
		<category><![CDATA[GECCO]]></category>

		<guid isPermaLink="false">http://www.xavierllora.net/?p=647</guid>
		<description><![CDATA[If you are planning to submit a paper for the 2010 Genetic and Evolutionary Computation Conference, the deadline is January 13, 2010 (and now extended to January 27th). You can find more information at the GECCO 2010 calendar site. Related posts:GECCO 2009 paper submission deadline extended till January 28 GECCO 2007 deadline extended GECCO 2009 [...]


Related posts:<ol><li><a href='http://www.xavierllora.net/2009/01/09/gecco-2009-paper-submission-deadline-extended-till-january-28/' rel='bookmark' title='Permanent Link: GECCO 2009 paper submission deadline extended till January 28'>GECCO 2009 paper submission deadline extended till January 28</a></li>
<li><a href='http://www.xavierllora.net/2007/01/16/gecco-2007-deadline-extended/' rel='bookmark' title='Permanent Link: GECCO 2007 deadline extended'>GECCO 2007 deadline extended</a></li>
<li><a href='http://www.xavierllora.net/2008/11/17/gecco-2009-submission-deadline/' rel='bookmark' title='Permanent Link: GECCO 2009 submission deadline'>GECCO 2009 submission deadline</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>If you are planning to submit a paper for the 2010 Genetic and Evolutionary Computation Conference, the deadline is January 13, 2010 (<strong>and now extended to January 27th</strong>). You can find more information at the <a href="http://www.sigevo.org/gecco-2010/calendar.html">GECCO 2010 calendar site</a>.</p>


<p>Related posts:<ol><li><a href='http://www.xavierllora.net/2009/01/09/gecco-2009-paper-submission-deadline-extended-till-january-28/' rel='bookmark' title='Permanent Link: GECCO 2009 paper submission deadline extended till January 28'>GECCO 2009 paper submission deadline extended till January 28</a></li>
<li><a href='http://www.xavierllora.net/2007/01/16/gecco-2007-deadline-extended/' rel='bookmark' title='Permanent Link: GECCO 2007 deadline extended'>GECCO 2007 deadline extended</a></li>
<li><a href='http://www.xavierllora.net/2008/11/17/gecco-2009-submission-deadline/' rel='bookmark' title='Permanent Link: GECCO 2009 submission deadline'>GECCO 2009 submission deadline</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.xavierllora.net/2009/12/19/gecco-2010-submission-deadline/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Scaling Genetic Algorithms using MapReduce</title>
		<link>http://www.xavierllora.net/2009/10/09/scaling-genetic-algorithms-using-mapreduce/</link>
		<comments>http://www.xavierllora.net/2009/10/09/scaling-genetic-algorithms-using-mapreduce/#comments</comments>
		<pubDate>Fri, 09 Oct 2009 15:51:19 +0000</pubDate>
		<dc:creator>Xavier</dc:creator>
				<category><![CDATA[Conferences]]></category>
		<category><![CDATA[Data-Intensive Computing]]></category>
		<category><![CDATA[Estimation of Distribution Algorithms]]></category>
		<category><![CDATA[Publications]]></category>
		<category><![CDATA[Research]]></category>
		<category><![CDATA[Technical Reports]]></category>
		<category><![CDATA[genetic algorithms]]></category>
		<category><![CDATA[hadoop]]></category>
		<category><![CDATA[map-reduce]]></category>

		<guid isPermaLink="false">http://www.xavierllora.net/?p=634</guid>
		<description><![CDATA[Below you may find the abstract of, and the link to the technical report for, the paper entitled &#8220;Scaling Genetic Algorithms using MapReduce&#8221; that will be presented at the Ninth International Conference on Intelligent Systems Design and Applications (ISDA) 2009 by Verma, A., Llorà, X., Campbell, R.H., Goldberg, D.E. next month. Abstract: Genetic algorithms (GAs) are increasingly [...]


Related posts:<ol><li><a href='http://www.xavierllora.net/2010/04/08/scaling-ecga-model-building-via-data-intensive-computing/' rel='bookmark' title='Permanent Link: Scaling eCGA Model Building via Data-Intensive Computing'>Scaling eCGA Model Building via Data-Intensive Computing</a></li>
<li><a href='http://www.xavierllora.net/2009/07/13/data-intensive-computing-for-competent-genetic-algorithms-a-pilot-study-using-meandre-2/' rel='bookmark' title='Permanent Link: Data-Intensive Computing for  Competent Genetic Algorithms:  A Pilot Study using Meandre'>Data-Intensive Computing for  Competent Genetic Algorithms:  A Pilot Study using Meandre</a></li>
<li><a href='http://www.xavierllora.net/2009/01/29/data-intensive-computing-for-competent-genetic-algorithms-a-pilot-study-using-meandre/' rel='bookmark' title='Permanent Link: Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study using  Meandre'>Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study using  Meandre</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>Below you may find the abstract of, and the link to the technical report for, the paper entitled <em>&#8220;Scaling Genetic Algorithms using MapReduce&#8221;</em> that will be presented at the <a href="">Ninth International Conference on Intelligent Systems Design and Applications (ISDA) 2009</a> by Verma, A., Llorà, X., Campbell, R.H., Goldberg, D.E. next month.</p>
<p><strong>Abstract:</strong> Genetic algorithms (GAs) are increasingly being applied to large-scale problems. The traditional MPI-based parallel GAs do not scale very well. MapReduce is a powerful abstraction developed by Google for making scalable and fault-tolerant applications. In this paper, we mould genetic algorithms into the MapReduce model. We describe the algorithm design and implementation of GAs on Hadoop, the open source implementation of MapReduce. Our experiments demonstrate the convergence and scalability up to 10<sup>5</sup> variable problems. Adding more resources would enable us to solve even larger problems without any changes in the algorithms and implementation.</p>
<p>The draft of the paper can be downloaded as <a href="http://www.illigal.uiuc.edu/pub/papers/IlliGALs/2009007.pdf">IlliGAL TR. No. 2009007</a>. For more information see the <a href="http://www.illigal.uiuc.edu/web/technical-reports/2009/10/09/scaling-genetic-algorithms-using-mapreduce/">IlliGAL technical reports web site</a>.</p>
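<p>To give a rough feel for what moulding a GA into the MapReduce model can look like, here is a minimal single-process sketch in plain Python. It is not the implementation from the paper and it does not use Hadoop: the OneMax fitness function, the function names, and the choice of tournament selection with uniform crossover are all assumptions made for illustration. The map step scores individuals independently and the reduce step builds the next population.</p>

```python
import random

def fitness(bits):
    # OneMax: the number of ones. A stand-in objective for illustration only.
    return sum(bits)

def map_phase(population):
    # Map step: score every individual independently of the others.
    return [(individual, fitness(individual)) for individual in population]

def reduce_phase(scored, rng, tournament_size=2):
    # Reduce step: tournament selection followed by uniform crossover.
    def select():
        contenders = rng.sample(scored, tournament_size)
        return max(contenders, key=lambda pair: pair[1])[0]

    offspring = []
    for _ in range(len(scored)):
        mother, father = select(), select()
        offspring.append([rng.choice(genes) for genes in zip(mother, father)])
    return offspring

def run(generations=30, pop_size=60, length=20, seed=1):
    # Each generation is one map step followed by one reduce step.
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population = reduce_phase(map_phase(population), rng)
    return max(fitness(individual) for individual in population)
```

<p>Because the map step touches one individual at a time, it is the part that would be spread across workers in a real MapReduce job; the reduce step here runs over the whole population only to keep the sketch short.</p>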


<p>Related posts:<ol><li><a href='http://www.xavierllora.net/2010/04/08/scaling-ecga-model-building-via-data-intensive-computing/' rel='bookmark' title='Permanent Link: Scaling eCGA Model Building via Data-Intensive Computing'>Scaling eCGA Model Building via Data-Intensive Computing</a></li>
<li><a href='http://www.xavierllora.net/2009/07/13/data-intensive-computing-for-competent-genetic-algorithms-a-pilot-study-using-meandre-2/' rel='bookmark' title='Permanent Link: Data-Intensive Computing for  Competent Genetic Algorithms:  A Pilot Study using Meandre'>Data-Intensive Computing for  Competent Genetic Algorithms:  A Pilot Study using Meandre</a></li>
<li><a href='http://www.xavierllora.net/2009/01/29/data-intensive-computing-for-competent-genetic-algorithms-a-pilot-study-using-meandre/' rel='bookmark' title='Permanent Link: Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study using  Meandre'>Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study using  Meandre</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.xavierllora.net/2009/10/09/scaling-genetic-algorithms-using-mapreduce/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Liquid: RDF meandering in FluidDB</title>
		<link>http://www.xavierllora.net/2009/08/25/liquid-rdf-meandering-in-fluiddb/</link>
		<comments>http://www.xavierllora.net/2009/08/25/liquid-rdf-meandering-in-fluiddb/#comments</comments>
		<pubDate>Tue, 25 Aug 2009 18:04:20 +0000</pubDate>
		<dc:creator>Xavier</dc:creator>
				<category><![CDATA[Data-Intensive Computing]]></category>
		<category><![CDATA[Notes]]></category>
		<category><![CDATA[Research]]></category>
		<category><![CDATA[Social Networks]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[cloud computing]]></category>
		<category><![CDATA[FluidDB]]></category>
		<category><![CDATA[meandre]]></category>
		<category><![CDATA[RDF]]></category>
		<category><![CDATA[storage]]></category>

		<guid isPermaLink="false">http://www.xavierllora.net/?p=577</guid>
		<description><![CDATA[Meandre (the data-intensive computing infrastructure pushed by NCSA) relies on RDF to describe components, flows, locations and repositories. RDF has become the central piece that makes Meandre&#8216;s flexibility and reusability possible. However, one piece remains largely sketchy and has no clear optimal solution: how can we make it easy for anybody to share, publish and annotate flows, components, [...]


Related posts:<ol><li><a href='http://www.xavierllora.net/2009/09/24/liquid-rdf-endpoint-for-fluiddb/' rel='bookmark' title='Permanent Link: Liquid: RDF endpoint for FluidDB'>Liquid: RDF endpoint for FluidDB</a></li>
<li><a href='http://www.xavierllora.net/2008/11/15/meandre-semantic-driven-data-intensive-flows-in-the-clouds/' rel='bookmark' title='Permanent Link: Meandre: Semantic-Driven Data-Intensive Flows in the Clouds'>Meandre: Semantic-Driven Data-Intensive Flows in the Clouds</a></li>
<li><a href='http://www.xavierllora.net/2009/01/11/meandre-140-final-release-candidate-tagged/' rel='bookmark' title='Permanent Link: Meandre 1.4.0 final release candidate tagged'>Meandre 1.4.0 final release candidate tagged</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p><a href="http://seasr.org/meandre/">Meandre</a> (the data-intensive computing infrastructure pushed by <a href="http://www.ncsa.illinois.edu">NCSA</a>) relies on <a href="http://www.w3.org/RDF/">RDF</a> to describe components, flows, locations and repositories. <a href="http://www.w3.org/RDF/">RDF</a> has become the central piece that makes <a href="http://seasr.org/meandre/">Meandre</a>&#8216;s flexibility and reusability possible. However, one piece remains largely sketchy and has no clear optimal solution: how can we make it easy for anybody to share, publish and annotate flows, components, locations and repositories? More importantly, how can that be done in the cloud in an open-ended fashion, allowing anybody to annotate and comment on each of the aforementioned pieces?</p>
<h3>The FluidDB trip</h3>
<p>During my last summer trip to Europe, <a href="http://blogs.fluidinfo.com/terry/">Terry Jones</a> (CEO) invited me to visit <a href="http://www.fluidinfo.com/">FluidInfo</a> (based in Barcelona), where I also met <a href="http://blogs.fluidinfo.com/esteve/">Esteve Fernandez</a> (CTO). It was a great opportunity to chat with the masterminds behind an intriguing concept I ran into after a short note I received from <a href="http://www.illigal.uiuc.edu/web/deg/vita/">David E. Goldberg</a>. <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a>, the main product being pushed by <a href="http://www.fluidinfo.com/">FluidInfo</a>, is an online collaborative &#8220;cloud&#8221; database. In <a href="http://www.fluidinfo.com/">FluidInfo</a>&#8217;s words:</p>
<blockquote><p>
FluidDB lets data be social. It allows almost unlimited information personalization by individual users and applications, and also between them. This makes it simple to build a wide variety of applications that benefit from cooperation, and which are open to unanticipated future enhancements. Even more importantly, FluidDB facilitates and encourages the growth of applications that leave users in control of their own data.
</p></blockquote>
<p><a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> went live in a private alpha last week. The basic concept behind the scenes is simple. <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> stores objects. Objects do not belong to anybody. Objects may be &#8220;blank&#8221; or they may be about something (e.g. <a href="http://seasr.org/meandre">http://seasr.org/meandre</a>). You can create as many blank objects as you want. Creating an object with the same about value always returns the same object (thus, there will only be one object about <a href="http://seasr.org/meandre">http://seasr.org/meandre</a>). Once objects exist, things get more interesting: you can tag any object with whatever tag you want. For instance, I could tag the <a href="http://seasr.org/meandre">http://seasr.org/meandre</a> object with a <code>hosted_by</code> tag and assign that tag the value <a href="http://www.ncsa.illinois.edu">&#8220;National Center for Supercomputing Applications&#8221;</a>. Values can be anything you want, from text and numerals to blobs. Finally, <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> introduces one last trick: namespaces. For instance, I got <code>xllora</code>. That means the tag I mentioned above would look like <code>/tags/xllora/hosted_by</code>. You can create as many nested namespaces under your main namespace as you want. <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> also provides mechanisms to control who can query and see the values of the tags you create.</p>
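<p>The object model described above is small enough to mimic in a few lines. The sketch below is a toy in-memory model, not the FluidDB API: the class name <code>TinyFluidStore</code> and its methods are hypothetical and exist only to illustrate about values, tagging, and namespaced tag paths.</p>

```python
import uuid

class TinyFluidStore:
    """Toy in-memory model of objects, about values and namespaced tags."""

    def __init__(self):
        self.by_about = {}  # about value to object id
        self.tags = {}      # object id to {tag path: value}

    def create_object(self, about=None):
        # Objects created with the same about value resolve to the same id.
        if about is not None and about in self.by_about:
            return self.by_about[about]
        object_id = str(uuid.uuid4())
        self.tags[object_id] = {}
        if about is not None:
            self.by_about[about] = object_id
        return object_id

    def tag(self, object_id, namespace, name, value=None):
        # A tag lives under the tagger's namespace, e.g. xllora/hosted_by.
        self.tags[object_id][namespace + '/' + name] = value

    def query_has(self, tag_path):
        # Return the ids of all objects carrying the given tag.
        return [oid for oid, tags in self.tags.items() if tag_path in tags]

store = TinyFluidStore()
meandre = store.create_object('http://seasr.org/meandre')
store.tag(meandre, 'xllora', 'hosted_by',
          'National Center for Supercomputing Applications')
```

<p>Calling <code>store.create_object('http://seasr.org/meandre')</code> a second time hands back the same id, while each call without an about value creates a fresh blank object.</p>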
<p>As you can see, the basic object model and its mechanics are very simple. When the alpha went live, <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> only provided access via a simple REST-like HTTP API. Within a few days, a blossom of client libraries wrapping that API were developed by a dynamic community that gathers on the <code>#fluiddb</code> channel on <code>irc.freenode.net</code>, where <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> and early adopters share experiences.</p>
<h3>You were saying something about RDF</h3>
<p>Back to the point. One thing I asked the <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> guys was what they thought about the similarities between <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a>&#8216;s object model and RDF. After playing with RDF for a while, the <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> model looked awfully familiar, although it is much simpler and more manageable than <a href="http://www.w3.org/RDF/">RDF</a>. They did not have much to say about it, and the question got stuck in the back of my mind. So when I got access to the private alpha, I could not help but go down the path of working out what it would mean to map <a href="http://www.w3.org/RDF/">RDF</a> onto <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a>. Yes, the simple, straight answer would be to stick serialized RDF into the value of a given tag (e.g. <code>xllora/rdf</code>). However, that option seemed poor, since it would not exploit the social aspect of collaborative annotations provided by <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a>. So back to the drawing board. What do both models have in common? They are both descriptions of something. In RDF you can see those as the subjects of the triples, whereas in <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> those are simply objects. <a href="http://www.w3.org/RDF/">RDF</a> uses properties to qualify objects. <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> uses tags. Both enable you to add values to qualified objects. Mmh, there you go.</p>
<p>With this idea in mind, I started <a href="http://github.com/xllora/liquid/tree/master">Liquid</a>, a simple proof-of-concept library that maps <a href="http://www.w3.org/RDF/">RDF</a> onto <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> and then gets it back. There was only one thing that needed a bit of patching. <a href="http://www.w3.org/RDF/">RDF</a> properties are arbitrary URIs. Those could not easily be mapped on top of <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> tags, so I took a simple compromise route.</p>
<ul>
<li><a href="http://www.w3.org/RDF/">RDF</a> subject URIs are mapped onto <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> qualified objects via the about tag</li>
<li>One <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> tag contains all the properties for that object (basically a simple dictionary encoded in JSON)</li>
<li>References to other <a href="http://www.w3.org/RDF/">RDF</a> URIs are mapped onto <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> object URIs, and vice versa</li>
</ul>
<p>Let&#8217;s make it a bit more chewable with a simple example.</p>

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;"><span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;?xml</span> <span style="color: #000066;">version</span>=<span style="color: #ff0000;">&quot;1.0&quot;</span><span style="color: #000000; font-weight: bold;">?&gt;</span></span>
&nbsp;
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;rdf:RDF</span></span>
<span style="color: #009900;"><span style="color: #000066;">xmlns:rdf</span>=<span style="color: #ff0000;">&quot;http://www.w3.org/1999/02/22-rdf-syntax-ns#&quot;</span></span>
<span style="color: #009900;"><span style="color: #000066;">xmlns:cd</span>=<span style="color: #ff0000;">&quot;http://www.recshop.fake/cd#&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
&nbsp;
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;rdf:Description</span></span>
<span style="color: #009900;"><span style="color: #000066;">rdf:about</span>=<span style="color: #ff0000;">&quot;http://www.recshop.fake/cd/Empire Burlesque&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
  <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;cd:artist<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>Bob Dylan<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/cd:artist<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
 <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/rdf:Description<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
&nbsp;
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/rdf:RDF<span style="color: #000000; font-weight: bold;">&gt;</span></span></span></pre></div></div>

<p>The above <a href="http://www.w3.org/RDF/">RDF</a> represents a single triple:</p>

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;">http://www.recshop.fake/cd/Empire Burlesque	http://www.recshop.fake/cd#artist	   &quot;Bob Dylan&quot;</pre></div></div>

<p>This triple could be mapped onto <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> by creating one qualified <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> object and adding the proper tags. The example below shows how to do so using <a href="http://github.com/njr0/fdb.py/tree/master">Python&#8217;s fdb.py client library</a> by <a href="http://StochasticSolutions.com/about.html">Nicholas J. Radcliffe</a>.</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #ff7700;font-weight:bold;">import</span> fdb,<span style="color: #dc143c;">sys</span>
<span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: #dc143c;">sys</span>.<span style="color: black;">version_info</span> <span style="color: #66cc66;">&lt;</span> <span style="color: black;">&#40;</span><span style="color: #ff4500;">2</span>, <span style="color: #ff4500;">6</span><span style="color: black;">&#41;</span>:
    <span style="color: #ff7700;font-weight:bold;">import</span> simplejson <span style="color: #ff7700;font-weight:bold;">as</span> json
<span style="color: #ff7700;font-weight:bold;">else</span>:
    <span style="color: #ff7700;font-weight:bold;">import</span> json
&nbsp;
__RDF_TAG__ = <span style="color: #483d8b;">'rdf'</span>
__RDF_TAG_PROPERTIES__  = <span style="color: #483d8b;">'rdf_properties'</span>
__RDF_TAG_MODEL_NAME__ = <span style="color: #483d8b;">'rdf_model_name'</span>
&nbsp;
<span style="color: #808080; font-style: italic;">#</span>
<span style="color: #808080; font-style: italic;"># Initialize the FluidDB client library</span>
<span style="color: #808080; font-style: italic;">#</span>
f = fdb.<span style="color: black;">FluidDB</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
<span style="color: #808080; font-style: italic;">#</span>
<span style="color: #808080; font-style: italic;"># Create the tags (if they exist, this won't hurt)</span>
<span style="color: #808080; font-style: italic;">#</span>
f.<span style="color: black;">create_abstract_tag</span><span style="color: black;">&#40;</span>__RDF_TAG__<span style="color: black;">&#41;</span>
f.<span style="color: black;">create_abstract_tag</span><span style="color: black;">&#40;</span>__RDF_TAG_PROPERTIES__<span style="color: black;">&#41;</span>
f.<span style="color: black;">create_abstract_tag</span><span style="color: black;">&#40;</span>__RDF_TAG_MODEL_NAME__<span style="color: black;">&#41;</span>
<span style="color: #808080; font-style: italic;">#</span>
<span style="color: #808080; font-style: italic;"># Create the subject object of the triple</span>
<span style="color: #808080; font-style: italic;">#	</span>
o = f.<span style="color: black;">create_object</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'http://www.recshop.fake/cd/Empire Burlesque'</span><span style="color: black;">&#41;</span>
<span style="color: #808080; font-style: italic;">#</span>
<span style="color: #808080; font-style: italic;"># Map RDF properties</span>
<span style="color: #808080; font-style: italic;">#</span>
properties = <span style="color: black;">&#123;</span><span style="color: #483d8b;">'http://www.recshop.fake/cd#artist'</span>:<span style="color: black;">&#91;</span><span style="color: #483d8b;">'Bob Dylan'</span><span style="color: black;">&#93;</span><span style="color: black;">&#125;</span>
<span style="color: #808080; font-style: italic;">#</span>
<span style="color: #808080; font-style: italic;"># Tag the object as RDF aware, properties available, and to which model/named graph </span>
<span style="color: #808080; font-style: italic;"># it belongs</span>
<span style="color: #808080; font-style: italic;">#</span>
f.<span style="color: black;">tag_object_by_id</span><span style="color: black;">&#40;</span>o.<span style="color: #008000;">id</span>, __RDF_TAG__<span style="color: black;">&#41;</span>
f.<span style="color: black;">tag_object_by_id</span><span style="color: black;">&#40;</span>o.<span style="color: #008000;">id</span>,__RDF_TAG_PROPERTIES__,value=json.<span style="color: black;">dumps</span><span style="color: black;">&#40;</span>properties<span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>
f.<span style="color: black;">tag_object_by_id</span><span style="color: black;">&#40;</span>o.<span style="color: #008000;">id</span>, __RDF_TAG_MODEL_NAME__,<span style="color: #483d8b;">'test_dummy'</span><span style="color: black;">&#41;</span></pre></div></div>
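<p>The example above hard-codes the properties dictionary for a single triple. A minimal sketch of how several triples could be grouped into one JSON dictionary per subject follows. The helper name <code>group_triples</code> is hypothetical (it is not part of Liquid or fdb.py), and the second triple is added only for illustration.</p>

```python
import json
from collections import defaultdict

def group_triples(triples):
    # Collect (subject, property, value) triples into
    # {subject: {property: [values]}}, the shape stored in the properties tag.
    grouped = defaultdict(lambda: defaultdict(list))
    for subject, prop, value in triples:
        grouped[subject][prop].append(value)
    return {subject: dict(props) for subject, props in grouped.items()}

SUBJECT = 'http://www.recshop.fake/cd/Empire Burlesque'
triples = [
    (SUBJECT, 'http://www.recshop.fake/cd#artist', 'Bob Dylan'),
    # Illustrative extra triple, not part of the example above.
    (SUBJECT, 'http://www.recshop.fake/cd#year', '1985'),
]

# One JSON payload per subject, ready to be used as a tag value.
payloads = {subject: json.dumps(props, sort_keys=True)
            for subject, props in group_triples(triples).items()}
```

<p>Each entry in <code>payloads</code> could then be passed as the <code>value</code> of the properties tag for the object created from the corresponding subject URI.</p>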

<p>Running along with this basic idea, I quickly stitched together <a href="http://github.com/xllora/liquid/tree/master">a simple library (Liquid)</a> that allows ingestion and retrieval of <a href="http://www.w3.org/RDF/">RDF</a> from <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a>. It is still very rudimentary and may not properly map all possible <a href="http://www.w3.org/RDF/">RDF</a>, but it is a working proof-of-concept implementation showing that it is possible to do so.</p>
<p>The Python code above just saves a triple. You can easily retrieve the triple by performing the following operation:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #ff7700;font-weight:bold;">import</span> fdb,<span style="color: #dc143c;">sys</span>
<span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: #dc143c;">sys</span>.<span style="color: black;">version_info</span> <span style="color: #66cc66;">&lt;</span> <span style="color: black;">&#40;</span><span style="color: #ff4500;">2</span>, <span style="color: #ff4500;">6</span><span style="color: black;">&#41;</span>:
    <span style="color: #ff7700;font-weight:bold;">import</span> simplejson <span style="color: #ff7700;font-weight:bold;">as</span> json
<span style="color: #ff7700;font-weight:bold;">else</span>:
    <span style="color: #ff7700;font-weight:bold;">import</span> json
&nbsp;
__RDF_TAG__ = <span style="color: #483d8b;">'rdf'</span>
__RDF_TAG_PROPERTIES__  = <span style="color: #483d8b;">'rdf_properties'</span>
__RDF_TAG_MODEL_NAME__ = <span style="color: #483d8b;">'rdf_model_name'</span>
&nbsp;
<span style="color: #808080; font-style: italic;">#</span>
<span style="color: #808080; font-style: italic;"># Initialize the FluidDB client library</span>
<span style="color: #808080; font-style: italic;">#</span>
f = fdb.<span style="color: black;">FluidDB</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
<span style="color: #808080; font-style: italic;">#</span>
<span style="color: #808080; font-style: italic;"># Retrieve the annotated objects</span>
<span style="color: #808080; font-style: italic;">#</span>
objs = f.<span style="color: black;">query</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'has xllora/%s'</span><span style="color: #66cc66;">%</span><span style="color: black;">&#40;</span>__RDF_TAG__<span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>
<span style="color: #808080; font-style: italic;">#</span>
<span style="color: #808080; font-style: italic;"># Optionally you could retrieve the ones only belonging to a given model by</span>
<span style="color: #808080; font-style: italic;">#</span>
<span style="color: #808080; font-style: italic;"># objs = f.query('has xllora/%s and xllora/%s matches &quot;%s&quot;'%(__RDF_TAG__,__RDF_TAG_MODEL_NAME__,modelname))</span>
<span style="color: #808080; font-style: italic;">#</span>
subs = <span style="color: black;">&#91;</span>f.<span style="color: black;">get_tag_value_by_id</span><span style="color: black;">&#40;</span>s,<span style="color: #483d8b;">'/tags/fluiddb/about'</span><span style="color: black;">&#41;</span> <span style="color: #ff7700;font-weight:bold;">for</span> s <span style="color: #ff7700;font-weight:bold;">in</span> objs<span style="color: black;">&#93;</span>
props_tmp = <span style="color: black;">&#91;</span>f.<span style="color: black;">get_tag_value_by_id</span><span style="color: black;">&#40;</span>s,<span style="color: #483d8b;">'/tags/xllora/'</span>+__RDF_TAG_PROPERTIES__<span style="color: black;">&#41;</span> <span style="color: #ff7700;font-weight:bold;">for</span> s <span style="color: #ff7700;font-weight:bold;">in</span> objs<span style="color: black;">&#93;</span>
props = <span style="color: black;">&#91;</span>json.<span style="color: black;">loads</span><span style="color: black;">&#40;</span>s<span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span> <span style="color: #ff7700;font-weight:bold;">if</span> s<span style="color: black;">&#91;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#93;</span>==<span style="color: #ff4500;">200</span> <span style="color: #ff7700;font-weight:bold;">else</span> <span style="color: black;">&#123;</span><span style="color: black;">&#125;</span> <span style="color: #ff7700;font-weight:bold;">for</span> s <span style="color: #ff7700;font-weight:bold;">in</span> props_tmp<span style="color: black;">&#93;</span></pre></div></div>

<p>Now <code>subs</code> contains the subject URIs of the retrieved objects, and <code>props</code> contains the dictionaries holding their properties.</p>
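<p>From there, getting back to plain triples is a matter of flattening the dictionaries. The sketch below assumes <code>subs</code> has already been reduced to a list of plain URI strings and <code>props</code> to a parallel list of dictionaries mapping each property URI to a list of values. The function name <code>to_triples</code> is made up for this example.</p>

```python
def to_triples(subs, props):
    # Pair each subject with its properties dictionary and emit one
    # (subject, property, value) tuple per stored value.
    triples = []
    for subject, properties in zip(subs, props):
        for prop, values in sorted(properties.items()):
            for value in values:
                triples.append((subject, prop, value))
    return triples

example_subs = ['http://www.recshop.fake/cd/Empire Burlesque']
example_props = [{'http://www.recshop.fake/cd#artist': ['Bob Dylan']}]
recovered = to_triples(example_subs, example_props)
```

<p>Objects whose properties tag could not be read end up with an empty dictionary and simply contribute no triples.</p>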
<h3>The bottom line</h3>
<p>OK. So, why is this mapping important? Basically, it allows collaborative tagging of the created objects (subjects), enabling a collaborative and social gathering of information on top of the mapped <a href="http://www.w3.org/RDF/">RDF</a>. So, what does it all mean?</p>
<p>It basically means that, if you do not need to ingest existing <a href="http://www.w3.org/RDF/">RDF</a> (where property URIs cannot be mapped directly and you need to Fluidify/reify them), any data stored in <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> is already in some form of triplified RDF. Let me explain what I mean by that. Each <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> object has a unique URI (e.g. <code>http://fluidDB.fluidinfo.com/objects/4fdf7ff4-f0da-4441-8e63-9b98ed26fc12</code>). Each tag is also uniquely identified by a URI (e.g. <code>http://fluidDB.fluidinfo.com/tags/xllora/rdf_model_name</code>). And finally, each object/tag pair may have a value (e.g. a literal <code>'test_dummy'</code> or maybe another URI <code>http://fluidDB.fluidinfo.com/objects/a0dda173-9ee0-4799-a507-8710045d2b07</code>). If an object/tag pair does not have a value, you can just point it to a no-value URI (or use some other convention you like).</p>
<p>Having said that, you now have all the pieces to express <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> data in plain shareable RDF. That basically means getting all the tags for an object, querying their values, and then generating an <a href="http://www.w3.org/RDF/">RDF</a> model by adding the gathered triples. That&#8217;s easy. Also, if you align your properties to tags, ingestion becomes just as trivial. I will try to get that piece into <a href="http://github.com/xllora/liquid/tree/master">Liquid</a> as soon as other issues allow me to do so <img src='http://www.xavierllora.net/wp-includes/images/smilies/icon_biggrin.gif' alt=':D' class='wp-smiley' /> .</p>
<p>Just to close, I would like to mention once again a key element of this picture. <a href="http://www.fluidinfo.com/fluiddb">FluidDB</a> opens the door to a truly cooperative, distributed, and online fluid semantic web. It is one of the first examples of how annotations (a.k.a. metadata) can be easily gathered and used in the &#8220;cloud&#8221; for the masses. Great job guys!</p>
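<p>As a rough sketch of that idea, the function below turns the tags of one object into (subject, predicate, object) tuples built from the URI patterns mentioned above. It works on a plain dictionary rather than calling FluidDB, and both the <code>NO_VALUE</code> URI and the function name are conventions assumed here, not something FluidDB or Liquid defines.</p>

```python
BASE = 'http://fluidDB.fluidinfo.com'
# Hypothetical convention for tags that carry no value.
NO_VALUE = BASE + '/no-value'

def object_to_triples(object_id, tag_values):
    # tag_values maps tag paths such as 'xllora/rdf_model_name' to a value,
    # or to None when the tag was attached without one.
    subject = BASE + '/objects/' + object_id
    triples = []
    for tag_path, value in sorted(tag_values.items()):
        predicate = BASE + '/tags/' + tag_path
        triples.append((subject, predicate,
                        NO_VALUE if value is None else value))
    return triples

example = object_to_triples(
    '4fdf7ff4-f0da-4441-8e63-9b98ed26fc12',
    {'xllora/rdf': None, 'xllora/rdf_model_name': 'test_dummy'})
```

<p>The resulting tuples could then be handed to whatever RDF toolkit you prefer in order to build and serialize the model.</p>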


<p>Related posts:<ol><li><a href='http://www.xavierllora.net/2009/09/24/liquid-rdf-endpoint-for-fluiddb/' rel='bookmark' title='Permanent Link: Liquid: RDF endpoint for FluidDB'>Liquid: RDF endpoint for FluidDB</a></li>
<li><a href='http://www.xavierllora.net/2008/11/15/meandre-semantic-driven-data-intensive-flows-in-the-clouds/' rel='bookmark' title='Permanent Link: Meandre: Semantic-Driven Data-Intensive Flows in the Clouds'>Meandre: Semantic-Driven Data-Intensive Flows in the Clouds</a></li>
<li><a href='http://www.xavierllora.net/2009/01/11/meandre-140-final-release-candidate-tagged/' rel='bookmark' title='Permanent Link: Meandre 1.4.0 final release candidate tagged'>Meandre 1.4.0 final release candidate tagged</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.xavierllora.net/2009/08/25/liquid-rdf-meandering-in-fluiddb/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Large Scale Data Mining using Genetics-Based Machine Learning</title>
		<link>http://www.xavierllora.net/2009/07/15/large-scale-data-mining-using-genetics-based-machine-learning/</link>
		<comments>http://www.xavierllora.net/2009/07/15/large-scale-data-mining-using-genetics-based-machine-learning/#comments</comments>
		<pubDate>Wed, 15 Jul 2009 21:56:17 +0000</pubDate>
		<dc:creator>Xavier</dc:creator>
				<category><![CDATA[Data-Intensive Computing]]></category>
		<category><![CDATA[GBML & LCS]]></category>
		<category><![CDATA[Learning Classifier Systems]]></category>
		<category><![CDATA[Presentations]]></category>
		<category><![CDATA[Research]]></category>
		<category><![CDATA[data-intensive flows]]></category>
		<category><![CDATA[genetics-based machine learning]]></category>
		<category><![CDATA[hadoop]]></category>
		<category><![CDATA[LCS]]></category>
		<category><![CDATA[map-reduce]]></category>

		<guid isPermaLink="false">http://www.xavierllora.net/?p=568</guid>
		<description><![CDATA[Below you may find the slides of the GECCO 2009 tutorial that Jaume Bacardit and I put together. Hope you enjoy it. Slides Abstract We are living in the petabyte era. We have larger and larger data to analyze, process and transform into useful answers for the domain experts. Robust data mining tools, able to cope [...]


Related posts:<ol><li><a href='http://www.xavierllora.net/2006/12/13/observer-invariant-histopathology-using-genetics-based-machine-learning/' rel='bookmark' title='Permanent Link: Observer-Invariant Histopathology using Genetics-Based Machine Learning'>Observer-Invariant Histopathology using Genetics-Based Machine Learning</a></li>
<li><a href='http://www.xavierllora.net/2009/04/07/deadline-extended-for-special-issue-on-metaheuristics-for-large-scale-data-mining/' rel='bookmark' title='Permanent Link: Deadline extended for special issue on Metaheuristics for Large Scale Data Mining'>Deadline extended for special issue on Metaheuristics for Large Scale Data Mining</a></li>
<li><a href='http://www.xavierllora.net/2008/03/26/bdcsg2008-algorithmic-perspectives-on-large-scale-social-network-data-jon-kleinberg/' rel='bookmark' title='Permanent Link: [BDCSG2008] Algorithmic Perspectives on Large-Scale Social Network Data (Jon Kleinberg)'>[BDCSG2008] Algorithmic Perspectives on Large-Scale Social Network Data (Jon Kleinberg)</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>Below you may find the slides of the <a href="http://www.sigevo.org/gecco-2009/tutorials.html#lsdm">GECCO 2009 tutorial</a> that <a href="http://www.cs.nott.ac.uk/~jqb/">Jaume Bacardit</a> and I put together. Hope you enjoy it.</p>
<p><strong>Slides</strong></p>
<object width="425" height="348"><param name="movie" value="http://static.slideshare.net/swf/ssplayer2.swf?doc=gecco2009largegbmltutorial-090715163244-phpapp01"/><param name="allowFullScreen" value="true"/><param name="allowScriptAccess" value="always"/><embed src="http://static.slideshare.net/swf/ssplayer2.swf?doc=gecco2009largegbmltutorial-090715163244-phpapp01"  type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="348"></embed></object>
<p><strong>Abstract</strong></p>
<p>We are living in the petabyte era. We have larger and larger data to analyze, process, and transform into useful answers for the domain experts. Robust data mining tools, able to cope with petascale volumes and/or high dimensionality while producing human-understandable solutions, are key in several domain areas. Genetics-based machine learning (GBML) techniques are perfect candidates for this task, among others, due to the recent advances in representations, learning paradigms, and theoretical modeling. If evolutionary learning techniques aspire to be a relevant player in this context, they need to have the capacity to process these vast amounts of data, and they need to process this data within reasonable time. Moreover, massive computation cycles are getting cheaper and cheaper every day, giving researchers access to unprecedented degrees of parallelization. Several topics are interlaced in these two requirements, among them: (1) having the proper learning paradigms and knowledge representations, (2) understanding them and knowing when they are suitable for the problem at hand, (3) using efficiency enhancement techniques, and (4) transforming and visualizing the produced solutions to give back as much insight as possible to the domain experts.</p>
<p>This tutorial will try to address these topics, following a roadmap that starts with the questions of what large means and why large is a challenge for GBML methods. Afterwards, we will discuss different facets in which we can overcome this challenge: efficiency enhancement techniques, representations able to cope with high-dimensional spaces, and the scalability of learning paradigms. We will also review a topic interlaced with all of them: how we can model the scalability of the components of our GBML systems, so that we can better engineer them to get the best performance on large datasets. The roadmap continues with examples of real applications of GBML systems and finishes with an analysis of further directions.</p>


<p>Related posts:<ol><li><a href='http://www.xavierllora.net/2006/12/13/observer-invariant-histopathology-using-genetics-based-machine-learning/' rel='bookmark' title='Permanent Link: Observer-Invariant Histopathology using Genetics-Based Machine Learning'>Observer-Invariant Histopathology using Genetics-Based Machine Learning</a></li>
<li><a href='http://www.xavierllora.net/2009/04/07/deadline-extended-for-special-issue-on-metaheuristics-for-large-scale-data-mining/' rel='bookmark' title='Permanent Link: Deadline extended for special issue on Metaheuristics for Large Scale Data Mining'>Deadline extended for special issue on Metaheuristics for Large Scale Data Mining</a></li>
<li><a href='http://www.xavierllora.net/2008/03/26/bdcsg2008-algorithmic-perspectives-on-large-scale-social-network-data-jon-kleinberg/' rel='bookmark' title='Permanent Link: [BDCSG2008] Algorithmic Perspectives on Large-Scale Social Network Data (Jon Kleinberg)'>[BDCSG2008] Algorithmic Perspectives on Large-Scale Social Network Data (Jon Kleinberg)</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.xavierllora.net/2009/07/15/large-scale-data-mining-using-genetics-based-machine-learning/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>
