<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Building a Vector Space Search Engine in Python</title>
	<atom:link href="http://blog.josephwilk.net/projects/building-a-vector-space-search-engine-in-python.html/feed" rel="self" type="application/rss+xml" />
	<link>http://blog.josephwilk.net/projects/building-a-vector-space-search-engine-in-python.html</link>
	<description>on AI, The Web, Usability, Testing &#38; Software process</description>
	<lastBuildDate>Wed, 27 Jan 2010 02:14:31 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.1</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: JS</title>
		<link>http://blog.josephwilk.net/projects/building-a-vector-space-search-engine-in-python.html/comment-page-1#comment-1082</link>
		<dc:creator>JS</dc:creator>
		<pubDate>Mon, 08 Dec 2008 20:26:19 +0000</pubDate>
		<guid isPermaLink="false">http://www.joesniff.co.uk/wordpress/projects/building-a-vector-space-search-engine-in-python.html#comment-1082</guid>
		<description>If the query contains terms not seen in the training documents, your code will break. For example, if the query was &quot;cat shannon&quot;, i.e. vectorSpace.search([&quot;cat&quot;, &quot;shannon&quot;]), the &quot;vector[self.vectorKeywordIndex[word]]&quot; line of makeVector will break since &quot;shannon&quot; is not indexed.</description>
		<content:encoded><![CDATA[<p>If the query contains terms not seen in the training documents, your code will break. For example, if the query was &#8220;cat shannon&#8221;, i.e. vectorSpace.search(["cat", "shannon"]), the &#8220;vector[self.vectorKeywordIndex[word]]&#8221; line of makeVector will break since &#8220;shannon&#8221; is not indexed.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Giuliani Vito, Ivan</title>
		<link>http://blog.josephwilk.net/projects/building-a-vector-space-search-engine-in-python.html/comment-page-1#comment-1071</link>
		<dc:creator>Giuliani Vito, Ivan</dc:creator>
		<pubDate>Sat, 06 Dec 2008 15:44:22 +0000</pubDate>
		<guid isPermaLink="false">http://www.joesniff.co.uk/wordpress/projects/building-a-vector-space-search-engine-in-python.html#comment-1071</guid>
		<description>You might be interested in django-searchable (http://code.google.com/p/django-searchable). It is a Django application that uses a vector space model with tf-idf weighting. It&#039;s experimental stuff, but you can already see how tf-idf weighting really works.</description>
		<content:encoded><![CDATA[<p>You might be interested in django-searchable (<a href="http://code.google.com/p/django-searchable" rel="nofollow">http://code.google.com/p/django-searchable</a>). It is a Django application that uses a vector space model with tf-idf weighting. It&#8217;s experimental stuff, but you can already see how tf-idf weighting really works.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Joseph Wilk</title>
		<link>http://blog.josephwilk.net/projects/building-a-vector-space-search-engine-in-python.html/comment-page-1#comment-895</link>
		<dc:creator>Joseph Wilk</dc:creator>
		<pubDate>Thu, 13 Nov 2008 21:25:15 +0000</pubDate>
		<guid isPermaLink="false">http://www.joesniff.co.uk/wordpress/projects/building-a-vector-space-search-engine-in-python.html#comment-895</guid>
		<description>Thanks for spotting the error. The PorterStemmer.py file had been truncated. I&#039;ve updated the version on GitHub (&lt;a href=&quot;http://github.com/josephwilk/semanticpy/tree/master&quot; rel=&quot;nofollow&quot;&gt;http://github.com/josephwilk/semanticpy/tree/master&lt;/a&gt;) and it works now.</description>
		<content:encoded><![CDATA[<p>Thanks for spotting the error. The PorterStemmer.py file had been truncated. I&#8217;ve updated the version on GitHub (<a href="http://github.com/josephwilk/semanticpy/tree/master" rel="nofollow">http://github.com/josephwilk/semanticpy/tree/master</a>) and it works now.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: ronin1770</title>
		<link>http://blog.josephwilk.net/projects/building-a-vector-space-search-engine-in-python.html/comment-page-1#comment-893</link>
		<dc:creator>ronin1770</dc:creator>
		<pubDate>Thu, 13 Nov 2008 11:43:57 +0000</pubDate>
		<guid isPermaLink="false">http://www.joesniff.co.uk/wordpress/projects/building-a-vector-space-search-engine-in-python.html#comment-893</guid>
		<description>hi,

i am trying to run this code - however i am getting the following error

Traceback (most recent call last):
  File &quot;E:\TayaIT_Dev\vector_model\VectorSpace.py&quot;, line 93, in &lt;module&gt;
    vectorSpace= VectorSpace(documents)
  File &quot;E:\TayaIT_Dev\vector_model\VectorSpace.py&quot;, line 25, in __init__
    self.build(documents)
  File &quot;E:\TayaIT_Dev\vector_model\VectorSpace.py&quot;, line 30, in build
    self.vectorKeywordIndex = self.getVectorKeywordIndex(documents)
  File &quot;E:\TayaIT_Dev\vector_model\VectorSpace.py&quot;, line 41, in getVectorKeywordIndex
    vocabularyList = self.parser.tokenise(vocabularyString)
  File &quot;E:\TayaIT_Dev\vector_model\Parser.py&quot;, line 36, in tokenise
    return [self.stemmer.stem(word,0,len(word)-1) for word in words]
AttributeError: PorterStemmer instance has no attribute &#039;stem&#039;

any ideas?

thanx in advance</description>
		<content:encoded><![CDATA[<p>hi,</p>
<p>i am trying to run this code &#8211; however i am getting the following error</p>
<p>Traceback (most recent call last):<br />
  File &#8220;E:\TayaIT_Dev\vector_model\VectorSpace.py&#8221;, line 93, in &lt;module&gt;<br />
    vectorSpace= VectorSpace(documents)<br />
  File &#8220;E:\TayaIT_Dev\vector_model\VectorSpace.py&#8221;, line 25, in __init__<br />
    self.build(documents)<br />
  File &#8220;E:\TayaIT_Dev\vector_model\VectorSpace.py&#8221;, line 30, in build<br />
    self.vectorKeywordIndex = self.getVectorKeywordIndex(documents)<br />
  File &#8220;E:\TayaIT_Dev\vector_model\VectorSpace.py&#8221;, line 41, in getVectorKeywordIndex<br />
    vocabularyList = self.parser.tokenise(vocabularyString)<br />
  File &#8220;E:\TayaIT_Dev\vector_model\Parser.py&#8221;, line 36, in tokenise<br />
    return [self.stemmer.stem(word,0,len(word)-1) for word in words]<br />
AttributeError: PorterStemmer instance has no attribute &#8217;stem&#8217;</p>
<p>any ideas?</p>
<p>thanx in advance</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: links for 2008-07-03 &#171; Breyten&#8217;s Dev Blog</title>
		<link>http://blog.josephwilk.net/projects/building-a-vector-space-search-engine-in-python.html/comment-page-1#comment-416</link>
		<dc:creator>links for 2008-07-03 &#171; Breyten&#8217;s Dev Blog</dc:creator>
		<pubDate>Thu, 03 Jul 2008 11:32:38 +0000</pubDate>
		<guid isPermaLink="false">http://www.joesniff.co.uk/wordpress/projects/building-a-vector-space-search-engine-in-python.html#comment-416</guid>
		<description>[...] Building a Vector Space Search Engine in Python (tags: code python search) [...]</description>
		<content:encoded><![CDATA[<p>[...] Building a Vector Space Search Engine in Python (tags: code python search) [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: dineshv</title>
		<link>http://blog.josephwilk.net/projects/building-a-vector-space-search-engine-in-python.html/comment-page-1#comment-306</link>
		<dc:creator>dineshv</dc:creator>
		<pubDate>Fri, 30 May 2008 16:12:02 +0000</pubDate>
		<guid isPermaLink="false">http://www.joesniff.co.uk/wordpress/projects/building-a-vector-space-search-engine-in-python.html#comment-306</guid>
		<description>Wrt my previous note the download zip file only included a portion of the PorterStemmer.py file.  Anyway, I was just interested in a Python stemming solution which I got from http://tartarus.org/~martin/PorterStemmer/python.txt.  Cheers!

Dinesh</description>
		<content:encoded><![CDATA[<p>Wrt my previous note the download zip file only included a portion of the PorterStemmer.py file.  Anyway, I was just interested in a Python stemming solution which I got from <a href="http://tartarus.org/~martin/PorterStemmer/python.txt" rel="nofollow">http://tartarus.org/~martin/PorterStemmer/python.txt</a>.  Cheers!</p>
<p>Dinesh</p>
]]></content:encoded>
	</item>
</channel>
</rss>
