<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: lxml.html</title>
	<atom:link href="http://blog.ianbicking.org/2007/09/24/lxmlhtml/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.ianbicking.org/2007/09/24/lxmlhtml/</link>
	<description></description>
	<lastBuildDate>Fri, 03 Sep 2010 01:53:06 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Catalin Festila</title>
		<link>http://blog.ianbicking.org/2007/09/24/lxmlhtml/comment-page-1/#comment-171895</link>
		<dc:creator>Catalin Festila</dc:creator>
		<pubDate>Mon, 30 Aug 2010 14:13:27 +0000</pubDate>
		<guid isPermaLink="false">http://blog.ianbicking.org/2007/09/24/lxmlhtml/#comment-171895</guid>
		<description>I tested your example and have a question about it.
If the result is another page (see your source &quot;result = parse(submit_form(f)).getroot()&quot;)
and this page has another form, what is the correct way to repeat your example using the new form?
I tried:
page2=parse(result).getroot()
but I got some errors.
Thanks</description>
		<content:encoded><![CDATA[<p>I tested your example and have a question about it.
If the result is another page (see your source &#8220;result = parse(submit_form(f)).getroot()&#8221;)
and this page has another form, what is the correct way to repeat your example using the new form?
I tried:
page2=parse(result).getroot()
but I got some errors.
Thanks</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ian Bicking</title>
		<link>http://blog.ianbicking.org/2007/09/24/lxmlhtml/comment-page-1/#comment-1126</link>
		<dc:creator>Ian Bicking</dc:creator>
		<pubDate>Wed, 26 Sep 2007 00:18:17 +0000</pubDate>
		<guid isPermaLink="false">http://blog.ianbicking.org/2007/09/24/lxmlhtml/#comment-1126</guid>
		<description>Re: testing -- there&#039;s a property to access the lxml document in WebTest (an extraction of paste.fixture): http://pythonpaste.org/webtest/#parsing-the-body</description>
		<content:encoded><![CDATA[<p>Re: testing &#8212; there&#8217;s a property to access the lxml document in WebTest (an extraction of paste.fixture): <a href="http://pythonpaste.org/webtest/#parsing-the-body" rel="nofollow">http://pythonpaste.org/webtest/#parsing-the-body</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Kumar McMillan</title>
		<link>http://blog.ianbicking.org/2007/09/24/lxmlhtml/comment-page-1/#comment-1122</link>
		<dc:creator>Kumar McMillan</dc:creator>
		<pubDate>Tue, 25 Sep 2007 18:58:05 +0000</pubDate>
		<guid isPermaLink="false">http://blog.ianbicking.org/2007/09/24/lxmlhtml/#comment-1122</guid>
		<description>oh, cool, I see it&#039;s already in lxml&#039;s trunk :)</description>
		<content:encoded><![CDATA[<p>oh, cool, I see it&#8217;s already in lxml&#8217;s trunk :)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Kumar McMillan</title>
		<link>http://blog.ianbicking.org/2007/09/24/lxmlhtml/comment-page-1/#comment-1121</link>
		<dc:creator>Kumar McMillan</dc:creator>
		<pubDate>Tue, 25 Sep 2007 18:52:47 +0000</pubDate>
		<guid isPermaLink="false">http://blog.ianbicking.org/2007/09/24/lxmlhtml/#comment-1121</guid>
		<description>Hi Ian, I just wanted to say thanks for hacking on this and I look forward to seeing it merged into lxml&#039;s trunk.  I&#039;ve deployed it from the branch and have been using it for a side project quite successfully.  Ironically, it&#039;s a screen scraper.  I tried using templatemaker on the scrapee but its template is obfuscated (my theory anyway) so it [breaks templatemaker pretty bad](http://code.google.com/p/templatemaker/issues/detail?id=4).  Anyway, xpath seems cleaner and more maintainable.  Plus, it is very useful to have HTMLElement.text_content() so that the template can be analyzed in a more contextual way.

I&#039;ve also been starting to work with paste (via pylons) a little more than just experimentation and it might be nice to use xpath for assertions on paste.fixture&#039;s response.  However, I haven&#039;t needed more than the provided mustcontain() method yet so maybe it would be overkill, just a thought.  

XPath would certainly be a clean, maintainable way to &quot;validate&quot; a template.  That is, it could answer: Do all my pages implement the common header/footer/nav-bar layout for the site?  I&#039;m not sure it is reasonable to go to such lengths in one&#039;s tests; this again is just a thought.</description>
		<content:encoded><![CDATA[<p>Hi Ian, I just wanted to say thanks for hacking on this and I look forward to seeing it merged into lxml&#8217;s trunk.  I&#8217;ve deployed it from the branch and have been using it for a side project quite successfully.  Ironically, it&#8217;s a screen scraper.  I tried using templatemaker on the scrapee but its template is obfuscated (my theory anyway) so it <a href="http://code.google.com/p/templatemaker/issues/detail?id=4">breaks templatemaker pretty bad</a>.  Anyway, xpath seems cleaner and more maintainable.  Plus, it is very useful to have HTMLElement.text_content() so that the template can be analyzed in a more contextual way.</p>

<p>I&#8217;ve also been starting to work with paste (via pylons) a little more than just experimentation and it might be nice to use xpath for assertions on paste.fixture&#8217;s response.  However, I haven&#8217;t needed more than the provided mustcontain() method yet so maybe it would be overkill, just a thought.  </p>

<p>XPath would certainly be a clean, maintainable way to &#8220;validate&#8221; a template.  That is, it could answer: Do all my pages implement the common header/footer/nav-bar layout for the site?  I&#8217;m not sure it is reasonable to go to such lengths in one&#8217;s tests; this again is just a thought.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: jgraham</title>
		<link>http://blog.ianbicking.org/2007/09/24/lxmlhtml/comment-page-1/#comment-1101</link>
		<dc:creator>jgraham</dc:creator>
		<pubDate>Mon, 24 Sep 2007 22:18:32 +0000</pubDate>
		<guid isPermaLink="false">http://blog.ianbicking.org/2007/09/24/lxmlhtml/#comment-1101</guid>
		<description>&gt; It would be nice to have something similar with html5lib.

It appears lxml has grown an HTML 5-incompatible desire to enforce &quot;valid&quot; tag names. In particular it seems things like lxml.etree.Element(&quot;foo?bar&quot;) now raise a ValueError even though HTML 5 parsers are expected to be able to create elements called &quot;foo?bar&quot;. Not only does this mean that not all HTML 5 trees can be represented in lxml.etree, but it means that our hack of using an HTML 5-illegal element name to represent the notional document root doesn&#039;t work any more. If this problem can be overcome, html5lib already supports generating lxml trees, so it would be easy to wrap it in syntax like lxml.html.html5lib.parse().</description>
		<content:encoded><![CDATA[<p>&gt; It would be nice to have something similar with html5lib.</p>

<p>It appears lxml has grown an HTML 5-incompatible desire to enforce &#8220;valid&#8221; tag names. In particular it seems things like lxml.etree.Element(&#8220;foo?bar&#8221;) now raise a ValueError even though HTML 5 parsers are expected to be able to create elements called &#8220;foo?bar&#8221;. Not only does this mean that not all HTML 5 trees can be represented in lxml.etree, but it means that our hack of using an HTML 5-illegal element name to represent the notional document root doesn&#8217;t work any more. If this problem can be overcome, html5lib already supports generating lxml trees, so it would be easy to wrap it in syntax like lxml.html.html5lib.parse().</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Martijn Faassen</title>
		<link>http://blog.ianbicking.org/2007/09/24/lxmlhtml/comment-page-1/#comment-1099</link>
		<dc:creator>Martijn Faassen</dc:creator>
		<pubDate>Mon, 24 Sep 2007 19:44:23 +0000</pubDate>
		<guid isPermaLink="false">http://blog.ianbicking.org/2007/09/24/lxmlhtml/#comment-1099</guid>
		<description>I&#039;m happy to see all this great work appear in lxml! lxml is still my baby too and I&#039;m very proud. :) Thanks Ian for these great contributions and thanks Stefan for being such an excellent lead developer for lxml!</description>
		<content:encoded><![CDATA[<p>I&#8217;m happy to see all this great work appear in lxml! lxml is still my baby too and I&#8217;m very proud. :) Thanks Ian for these great contributions and thanks Stefan for being such an excellent lead developer for lxml!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ian Bicking</title>
		<link>http://blog.ianbicking.org/2007/09/24/lxmlhtml/comment-page-1/#comment-1094</link>
		<dc:creator>Ian Bicking</dc:creator>
		<pubDate>Mon, 24 Sep 2007 17:26:54 +0000</pubDate>
		<guid isPermaLink="false">http://blog.ianbicking.org/2007/09/24/lxmlhtml/#comment-1094</guid>
		<description>Fredrik: I&#039;m not sure.  Obviously the first part is parsing HTML, but there are BeautifulSoup parsers and native serialization in ET 1.3.  I use `el.getparent()` sometimes; not a tremendous amount, but I&#039;ve found it&#039;s hard to refactor when you don&#039;t have that pointer to the parent, as is the case in ElementTree.  Also, many of these use XPath very heavily.  You could probably rewrite several of them as simpler (ET-compatible) expressions plus a simple list comprehension.

A couple might be more feasible than the others: the differ, maybe formfill, and definitely the doctest comparison stuff would be okay (since it is based on code that was written for ElementTree originally).</description>
		<content:encoded><![CDATA[<p>Fredrik: I&#8217;m not sure.  Obviously the first part is parsing HTML, but there are BeautifulSoup parsers and native serialization in ET 1.3.  I use <code>el.getparent()</code> sometimes; not a tremendous amount, but I&#8217;ve found it&#8217;s hard to refactor when you don&#8217;t have that pointer to the parent, as is the case in ElementTree.  Also, many of these use XPath very heavily.  You could probably rewrite several of them as simpler (ET-compatible) expressions plus a simple list comprehension.</p>

<p>A couple might be more feasible than the others: the differ, maybe formfill, and definitely the doctest comparison stuff would be okay (since it is based on code that was written for ElementTree originally).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Fredrik</title>
		<link>http://blog.ianbicking.org/2007/09/24/lxmlhtml/comment-page-1/#comment-1093</link>
		<dc:creator>Fredrik</dc:creator>
		<pubDate>Mon, 24 Sep 2007 16:13:16 +0000</pubDate>
		<guid isPermaLink="false">http://blog.ianbicking.org/2007/09/24/lxmlhtml/#comment-1093</guid>
		<description>So how hard would it be to make the relevant portions of this work for xml.etree as well?</description>
		<content:encoded><![CDATA[<p>So how hard would it be to make the relevant portions of this work for xml.etree as well?</p>
]]></content:encoded>
	</item>
</channel>
</rss>

