<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-4821566683798191899</id><updated>2011-10-04T12:55:01.091-07:00</updated><category term='facebook'/><category term='eclipse'/><category term='social-registration'/><category term='django'/><category term='virtualenv'/><category term='debugging'/><category term='python'/><category term='webfaction'/><category term='Pinax'/><category term='Avatar'/><title type='text'>Programmer's Notes and Random Random Thoughts</title><subtitle type='html'></subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://chuckin-py.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4821566683798191899/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://chuckin-py.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Chuck</name><uri>http://www.blogger.com/profile/14869760862894848503</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='31' height='32' src='http://2.bp.blogspot.com/_nMTGTxTJQ6g/TJjETcI2nkI/AAAAAAAAAAw/oYkztOspMP0/S220/fb2.jpg'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>13</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-4821566683798191899.post-6161070708522608274</id><published>2011-01-30T12:12:00.000-08:00</published><updated>2011-02-03T05:24:44.003-08:00</updated><category 
scheme='http://www.blogger.com/atom/ns#' term='Pinax'/><category scheme='http://www.blogger.com/atom/ns#' term='facebook'/><category scheme='http://www.blogger.com/atom/ns#' term='Avatar'/><title type='text'>Facebook and Pinax Avatars</title><content type='html'>Pinax is bundled with an app for managing avatars. If your users are uploading their photos, then using the avatar app is pretty straightforward. If you are grabbing an avatar from a user's Facebook profile, then things are not as easy.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The photo can be accessed using:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;file = urllib.urlopen("https://graph.facebook.com/%s/picture?type=normal"%(user['id'],))&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The problem comes when you try to save the photo to an avatar by passing the raw data, using something like this:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;avatar=Avatar(user=request.user)&lt;/div&gt;&lt;div&gt;&lt;div&gt;new_file = avatar.avatar.save(path, file.read())&lt;/div&gt;&lt;div&gt;avatar.save()&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I was able to get it to work by saving the photo to a temp file, then wrapping it in a Django File:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;import os&lt;/div&gt;&lt;div&gt;import urllib&lt;/div&gt;&lt;div&gt;from datetime import date&lt;/div&gt;&lt;div&gt;&lt;div&gt;from tempfile import NamedTemporaryFile&lt;/div&gt;&lt;div&gt;from django.conf import settings&lt;/div&gt;&lt;div&gt;from django.core.files import File&lt;/div&gt;&lt;/div&gt;&lt;div&gt;from avatar.models import Avatar, avatar_file_path&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;file = urllib.urlopen("https://graph.facebook.com/%s/picture?type=normal"%(user['id'],))&lt;/div&gt;&lt;div&gt;fp = NamedTemporaryFile(delete=True)&lt;/div&gt;&lt;div&gt;fp.write(file.read())&lt;/div&gt;&lt;div&gt;avatar = Avatar(user=request.user)&lt;/div&gt;&lt;div&gt;path = avatar_file_path(user=request.user, filename='facebook_%s.jpg' % date.today())
&lt;/div&gt;&lt;div&gt;avatar.avatar.save(os.path.join(settings.MEDIA_ROOT,path),File(fp))&lt;/div&gt;&lt;div&gt;avatar.save()&lt;/div&gt;&lt;div&gt;fp.close()&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I tried using StringIO, but could not get it to work.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4821566683798191899-6161070708522608274?l=chuckin-py.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://chuckin-py.blogspot.com/feeds/6161070708522608274/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://chuckin-py.blogspot.com/2011/01/facebook-and-pinax-avatars.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4821566683798191899/posts/default/6161070708522608274'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4821566683798191899/posts/default/6161070708522608274'/><link rel='alternate' type='text/html' href='http://chuckin-py.blogspot.com/2011/01/facebook-and-pinax-avatars.html' title='Facebook and Pinax Avatars'/><author><name>Chuck</name><uri>http://www.blogger.com/profile/14869760862894848503</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='31' height='32' src='http://2.bp.blogspot.com/_nMTGTxTJQ6g/TJjETcI2nkI/AAAAAAAAAAw/oYkztOspMP0/S220/fb2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4821566683798191899.post-3281525954341373247</id><published>2011-01-23T13:09:00.000-08:00</published><updated>2011-01-30T12:11:57.032-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='social-registration'/><category scheme='http://www.blogger.com/atom/ns#' term='django'/><category scheme='http://www.blogger.com/atom/ns#' term='facebook'/><title type='text'>Django 
Social-registration and the Facebook API</title><content type='html'>I am creating a Pinax-based website that will require users to have an account and to log in. I wanted users with Facebook accounts to be able to use OAuth and Facebook to create their account on my site. I also wanted to use data from Facebook to fill in as much of the user profile as possible. To accomplish this, I installed the app &lt;a href="https://github.com/flashingpumpkin/django-socialregistration"&gt;social-registration&lt;/a&gt;.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Facebook provides basic info such as first name and last name without extended permissions. Email address requires extended access. &lt;a href="http://developers.facebook.com/docs/authentication/"&gt;Extended access&lt;/a&gt; can be requested using the scope parameter during Facebook login.&lt;br /&gt;&lt;div&gt;&lt;div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Social-registration has some JavaScript (templates/socialregistration/facebook_js.html) that you can put in your templates to allow users to log in with Facebook. In that file, the code that adds the scope parameter is commented out:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;FB.login(handleResponse/*,{perms:'publish_stream,sms,offline_access,email,read_stream,status_update,etc'}*/);&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Uncommenting that code solves part of the problem. However, the perms list is really just a placeholder. The entries status_update and etc are not &lt;a href="http://developers.facebook.com/docs/authentication/permissions"&gt;valid perms&lt;/a&gt;, but the rest are. 
Remove those and it should work.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;If you put the social-registration middleware in your settings, then you can access all of the user's Facebook info in your views using:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;data_dict = request.facebook.graph.get_object('me')&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4821566683798191899-3281525954341373247?l=chuckin-py.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://chuckin-py.blogspot.com/feeds/3281525954341373247/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://chuckin-py.blogspot.com/2011/01/facebook-api-and-pinax.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4821566683798191899/posts/default/3281525954341373247'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4821566683798191899/posts/default/3281525954341373247'/><link rel='alternate' type='text/html' href='http://chuckin-py.blogspot.com/2011/01/facebook-api-and-pinax.html' title='Django Social-registration and the Facebook API'/><author><name>Chuck</name><uri>http://www.blogger.com/profile/14869760862894848503</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='31' height='32' src='http://2.bp.blogspot.com/_nMTGTxTJQ6g/TJjETcI2nkI/AAAAAAAAAAw/oYkztOspMP0/S220/fb2.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4821566683798191899.post-3616680942968352457</id><published>2011-01-06T16:07:00.000-08:00</published><updated>2011-01-06T16:42:44.564-08:00</updated><title type='text'>Pinax Static Media</title><content 
type='html'>Pinax 0.7 uses the Django app called staticfiles to manage...well... static files. This app is installed in the virtualenv site-packages directory.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The thing that makes this app a little hard to understand at first is that it's not very DRY. The way it works is that you scatter your static files all over your Django project (with rules about where each dir should be). In your settings.py file you create a variable called STATICFILES_DIRS. It looks something like this:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;span class="Apple-style-span"   &gt;STATICFILES_DIRS = (&lt;/span&gt;&lt;div&gt;&lt;span class="Apple-style-span"   &gt;      ('pinax', os.path.join(PINAX_ROOT, 'media', PINAX_THEME)),&lt;/span&gt;&lt;div&gt;&lt;span class="Apple-style-span"   &gt;      ('skatepark', os.path.join(PROJECT_ROOT, 'media')),)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;Then you run:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"   &gt;python manage.py build_media --all&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Note that this command is different in version 0.9. It goes to each dir in STATICFILES_DIRS and copies all the subdirs and files into the dir specified by the settings.py variable STATIC_ROOT. 
For example:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;&lt;span class="Apple-style-span"   &gt;STATIC_ROOT = os.path.join(PROJECT_ROOT, 'site_media', 'static')&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"&gt;I guess in the end, putting all your static files in one place makes it easier to serve them on the production server. &lt;/span&gt;At this point in my project, the value of the staticfiles app is that it makes it easy to override default static media. In my case I wanted to override the Pinax logo.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The default was at:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"   &gt;/home/me/envs/skatepark/lib/python2.7/site-packages/pinax/media/default/pinax/images/logo.png&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;I put the logo I wanted in my project dir:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"   &gt;/home/me/skatepark/media/pinax/images/logo.png&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The command&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  &gt;python manage.py build_media --all&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;started by copying the first dir in STATICFILES_DIRS, which wrote the default logo to:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span class="Apple-style-span"   &gt;/home/me/skatepark/site_media/static/pinax/images/logo.png&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Then the next dir in STATICFILES_DIRS was scanned and the default logo was overwritten by my 
logo.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4821566683798191899-3616680942968352457?l=chuckin-py.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://chuckin-py.blogspot.com/feeds/3616680942968352457/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://chuckin-py.blogspot.com/2011/01/pinax-static-media.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4821566683798191899/posts/default/3616680942968352457'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4821566683798191899/posts/default/3616680942968352457'/><link rel='alternate' type='text/html' href='http://chuckin-py.blogspot.com/2011/01/pinax-static-media.html' title='Pinax Static Media'/><author><name>Chuck</name><uri>http://www.blogger.com/profile/14869760862894848503</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='31' height='32' src='http://2.bp.blogspot.com/_nMTGTxTJQ6g/TJjETcI2nkI/AAAAAAAAAAw/oYkztOspMP0/S220/fb2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4821566683798191899.post-69192699587319399</id><published>2011-01-02T07:21:00.000-08:00</published><updated>2011-01-02T07:50:45.371-08:00</updated><title type='text'>Testing Facebook API on Local Host</title><content type='html'>I am developing a django site using &lt;a href="https://github.com/flashingpumpkin/django-socialregistration"&gt;django-socialregistration&lt;/a&gt;. 
I am using the Django development server.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I set up the Facebook app via &lt;a href="http://www.facebook.com/developers/apps.php"&gt;http://www.facebook.com/developers/apps.php&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This page let me get the API_KEY and SECRET_KEY.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Without changing any other settings, I got this error:&lt;/div&gt;&lt;div&gt;&lt;/div&gt;&lt;span&gt;&lt;span&gt;&lt;br /&gt;&lt;span class="Apple-style-span"   &gt;API Error Code: 191&lt;br /&gt;API Error Description: The specified URL is not owned by the application&lt;br /&gt;Error Message: redirect_uri is not owned by the application.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;The Facebook app settings include settings for:&lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;Site URL&lt;/li&gt;&lt;li&gt;Site Domain&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;When I set:&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Site URL: http://127.0.0.1:8000/&lt;/div&gt;&lt;div&gt;Site Domain: &amp;lt;blank&amp;gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I was able to test the API.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4821566683798191899-69192699587319399?l=chuckin-py.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://chuckin-py.blogspot.com/feeds/69192699587319399/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://chuckin-py.blogspot.com/2011/01/testing-facebook-api-on-local-host.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4821566683798191899/posts/default/69192699587319399'/><link rel='self' 
type='application/atom+xml' href='http://www.blogger.com/feeds/4821566683798191899/posts/default/69192699587319399'/><link rel='alternate' type='text/html' href='http://chuckin-py.blogspot.com/2011/01/testing-facebook-api-on-local-host.html' title='Testing Facebook API on Local Host'/><author><name>Chuck</name><uri>http://www.blogger.com/profile/14869760862894848503</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='31' height='32' src='http://2.bp.blogspot.com/_nMTGTxTJQ6g/TJjETcI2nkI/AAAAAAAAAAw/oYkztOspMP0/S220/fb2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4821566683798191899.post-4974724552792949285</id><published>2010-11-11T09:39:00.000-08:00</published><updated>2010-11-11T09:51:03.816-08:00</updated><title type='text'>Problem Migrating Multilingual to Django 1.2</title><content type='html'>I just upgraded a multilingual Django project to v1.2.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;To upgrade to the new version of the multilingual app, I switched to the &lt;a href="https://github.com/ojii/django-multilingual-ng"&gt;django-multilingual-ng&lt;/a&gt; branch. That branch involved some changes in the database structure.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;As recommended, I installed the app &lt;a href="http://south.aeracode.org/"&gt;South&lt;/a&gt; to migrate the database. Then I ran the mlng_convert command on my apps that use multilingual.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The problem I encountered was caused by apps whose models refer to other apps via foreign keys. Evidently South and mlng_convert follow those relationships and update those tables as well. 
&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;If I ran mlng_convert on a table that was already converted, I got the database error (MySQL) "Duplicate column name 'language_code'"&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4821566683798191899-4974724552792949285?l=chuckin-py.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://chuckin-py.blogspot.com/feeds/4974724552792949285/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://chuckin-py.blogspot.com/2010/11/problem-migrating-multilingual-to.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4821566683798191899/posts/default/4974724552792949285'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4821566683798191899/posts/default/4974724552792949285'/><link rel='alternate' type='text/html' href='http://chuckin-py.blogspot.com/2010/11/problem-migrating-multilingual-to.html' title='Problem Migrating Multilingual to Django 1.2'/><author><name>Chuck</name><uri>http://www.blogger.com/profile/14869760862894848503</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='31' height='32' src='http://2.bp.blogspot.com/_nMTGTxTJQ6g/TJjETcI2nkI/AAAAAAAAAAw/oYkztOspMP0/S220/fb2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4821566683798191899.post-6243051304672054824</id><published>2010-09-02T07:26:00.000-07:00</published><updated>2010-09-02T07:39:15.464-07:00</updated><title type='text'>More Webfaction, virtualenv --no-site-packages</title><content type='html'>I found a way to get the --no-site-packages flag for virtualenv to work "good enuf" on webfaction. 
The solution involved reading &lt;a href="http://forum.webfaction.com/viewtopic.php?id=3529"&gt;this forum discussion&lt;/a&gt; more carefully. The mojo (hack) is in post #7.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;My plan is to put packages that are of general use and that rarely change in ~/lib/python2.X. These will not be available in my virtualenv. For standard packages that I want in the virtualenv, I will manually add paths to them using the virtualenvwrapper command add2virtualenv.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4821566683798191899-6243051304672054824?l=chuckin-py.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://chuckin-py.blogspot.com/feeds/6243051304672054824/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://chuckin-py.blogspot.com/2010/09/more-webfaction-virtualenv-no-site.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4821566683798191899/posts/default/6243051304672054824'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4821566683798191899/posts/default/6243051304672054824'/><link rel='alternate' type='text/html' href='http://chuckin-py.blogspot.com/2010/09/more-webfaction-virtualenv-no-site.html' title='More Webfaction, virtualenv --no-site-packages'/><author><name>Chuck</name><uri>http://www.blogger.com/profile/14869760862894848503</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='31' height='32' 
src='http://2.bp.blogspot.com/_nMTGTxTJQ6g/TJjETcI2nkI/AAAAAAAAAAw/oYkztOspMP0/S220/fb2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4821566683798191899.post-8270623417172582539</id><published>2010-08-29T14:00:00.000-07:00</published><updated>2010-09-02T07:26:12.448-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='virtualenv'/><category scheme='http://www.blogger.com/atom/ns#' term='webfaction'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>Webfaction, virtualenv and --no-site-packages</title><content type='html'>So I thought I had finally figured out how I wanted to configure my code on webfaction using virtualenv. My plan involved using the flag --no-site-packages.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Once I created the virtualenv, I fired up python and ran the code:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;import sys&lt;/div&gt;&lt;div&gt;for x in sys.path:&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt; &lt;/span&gt;print x&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Much to my surprise, all "my site packages" (/username/lib/python2.5) paths were in sys.path. I was hoping these packages would not be included. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a href="http://forum.webfaction.com/viewtopic.php?id=3529"&gt;This post on the webfaction forum&lt;/a&gt; explained what was happening. Basically, the --no-site-packages flag only removes the packages installed by webfaction for all users, not the ones in my own lib directory. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Next, I tried to set up the paths using a sitecustomize.py file. When python starts, it looks for that file. If it finds it, it imports it. 
Unfortunately, that did not work either, because webfaction already has one in /usr/local/lib/python2.5.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I tried adding an import statement to a .pth file. That solution did not work because I could not control when that file would be executed relative to the other .pth files. In my case, it ran before the paths I wanted to remove had been added to sys.path.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;For now, here's my solution. The only project I have running in virtualenv is Django. I am just going to add an import statement in manage.py. It will configure the site path. On my development computer, I am not having all these problems, so I will put the import statement in a try block. If the config module is not found, it will just move on.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4821566683798191899-8270623417172582539?l=chuckin-py.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://chuckin-py.blogspot.com/feeds/8270623417172582539/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://chuckin-py.blogspot.com/2010/08/webfaction-virtualenv-and-no-site.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4821566683798191899/posts/default/8270623417172582539'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4821566683798191899/posts/default/8270623417172582539'/><link rel='alternate' type='text/html' href='http://chuckin-py.blogspot.com/2010/08/webfaction-virtualenv-and-no-site.html' title='Webfaction, virtualenv and --no-site-packages'/><author><name>Chuck</name><uri>http://www.blogger.com/profile/14869760862894848503</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='31' 
height='32' src='http://2.bp.blogspot.com/_nMTGTxTJQ6g/TJjETcI2nkI/AAAAAAAAAAw/oYkztOspMP0/S220/fb2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4821566683798191899.post-3532282110127392922</id><published>2010-08-27T14:41:00.000-07:00</published><updated>2010-08-29T14:00:43.782-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='virtualenv'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>virtualenv and --no-site-packages</title><content type='html'>I originally created my virtualenv without the flag --no-site-packages. I figured that the virtualenv paths would override (come before) the global paths. That turned out to be wrong. The global paths come first.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;One way to get around this is to create the env with the --no-site-packages flag and then manually add the global packages that I want using the virtualenvwrapper command add2virtualenv. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Since the env was already set up, I looked for ways to add that flag after the env was created. &lt;a href="http://stackoverflow.com/questions/3371136/revert-the-no-site-packages-option-with-virtualenv"&gt;Here is the answer.&lt;/a&gt; To have virtualenv not use global libs, just add the file "no-global-site-packages.txt" to the directory virtualenv_name/lib/python2.5. Note that this path is different from the one given on stackoverflow. I figured out the path by creating a test virtualenv and seeing where that file turned up.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;But that's not really what I want to do. I want to have my env access the global paths because there are a lot of useful libs in there. It would be a lot of work manually adding each to the virtualenv. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Part of the answer seems to be in the .pth files. 
&lt;a href="http://stackoverflow.com/questions/1860348/virtualenv-global-site-packages-vs-the-site-packages-in-the-virtual-environment"&gt;It turns out that when you install a package in the virtual env using easy_install, easy_install writes a .pth file that moves the new package ahead of any versions of it that are already on the path&lt;/a&gt;.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A quick peek in easy-install.pth reveals lines of code in addition to the paths! Here it is:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;/div&gt;&lt;blockquote&gt;&lt;div&gt;import sys; sys.__plen = len(sys.path)&lt;/div&gt;&lt;div&gt;&lt;div&gt;./setuptools-0.6c11-py2.5.egg&lt;/div&gt;&lt;div&gt;./pip-0.7.2-py2.5.egg&lt;/div&gt;&lt;div&gt;import sys; new=sys.path[sys.__plen:]; del sys.path[sys.__plen:]; p=getattr(sys,'__egginsert',0); sys.path[p:p]=new; sys.__egginsert = p+len(new)&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;/div&gt;&lt;/blockquote&gt;&lt;div&gt;I found &lt;a href="http://nedbatchelder.com/blog/201001/running_code_at_python_startup.html"&gt;more info on this magic here&lt;/a&gt;. 
Apparently if you start a line with import and put all the code after it on the same line using ";" as a delimiter, then you can hack the path at startup.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4821566683798191899-3532282110127392922?l=chuckin-py.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://chuckin-py.blogspot.com/feeds/3532282110127392922/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://chuckin-py.blogspot.com/2010/08/virtualenv-and-no-site-packages.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4821566683798191899/posts/default/3532282110127392922'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4821566683798191899/posts/default/3532282110127392922'/><link rel='alternate' type='text/html' href='http://chuckin-py.blogspot.com/2010/08/virtualenv-and-no-site-packages.html' title='virtualenv and --no-site-packages'/><author><name>Chuck</name><uri>http://www.blogger.com/profile/14869760862894848503</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='31' height='32' src='http://2.bp.blogspot.com/_nMTGTxTJQ6g/TJjETcI2nkI/AAAAAAAAAAw/oYkztOspMP0/S220/fb2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4821566683798191899.post-5831149944215011121</id><published>2010-07-28T08:00:00.000-07:00</published><updated>2010-07-28T08:09:05.191-07:00</updated><title type='text'>Even More Setting Up Eclipse for Python and Django</title><content type='html'>A while back, I punted on this effort. I cannot remember the details, but eventually things would get so buggered, that I would have to restart Eclipse. There was no discernible pattern. 
Everything would work fine, then my browser would stop seeing the Django dev server. &lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Re-starting Python processes did not help. Restarting the browser did not help. Switching browsers did not help. So I decided I do not need break points.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I am still using Eclipse. I like its auto complete and syntax checking. I like its user interface. But now I run the Django dev server in a command line window. &lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4821566683798191899-5831149944215011121?l=chuckin-py.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://chuckin-py.blogspot.com/feeds/5831149944215011121/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://chuckin-py.blogspot.com/2010/07/even-more-setting-up-eclipse-for-python.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4821566683798191899/posts/default/5831149944215011121'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4821566683798191899/posts/default/5831149944215011121'/><link rel='alternate' type='text/html' href='http://chuckin-py.blogspot.com/2010/07/even-more-setting-up-eclipse-for-python.html' title='Even More Setting Up Eclipse for Python and Django'/><author><name>Chuck</name><uri>http://www.blogger.com/profile/14869760862894848503</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='31' height='32' 
src='http://2.bp.blogspot.com/_nMTGTxTJQ6g/TJjETcI2nkI/AAAAAAAAAAw/oYkztOspMP0/S220/fb2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4821566683798191899.post-7685165704804417237</id><published>2010-07-28T05:54:00.000-07:00</published><updated>2010-11-11T09:39:18.952-08:00</updated><title type='text'>Error: pywintypes.py DLL load failed</title><content type='html'>Some of my wxpython routines that used to work stopped working. The error message was:&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="color:#FF0000;"&gt;Traceback (most recent call last):&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="color:#FF0000;"&gt;  File "boot_com_servers.py", line 21, in &amp;lt;module&amp;gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="color:#FF0000;"&gt;  File "c:\python25\lib\site-packages\pythoncom.py", line 3, in &amp;lt;module&amp;gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="color:#FF0000;"&gt;    pywintypes.__import_pywin32_system_module__("pythoncom", globals())&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="color:#FF0000;"&gt;  File "C:\Python25\lib\site-packages\win32\lib\pywintypes.py", line 100, in __import_pywin32_system_module__&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="color:#FF0000;"&gt;    ('.dll', 'rb', imp.C_EXTENSION))&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="color:#FF0000;"&gt;ImportError: DLL load failed: The specified procedure could not be found.&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;If you look at pywintypes.py, you will see that it is a major hack that goes through all sorts of contortions to find the DLLs. 
Here are some of the comments:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;  &lt;span class="Apple-style-span" style="color:#FF0000;"&gt;  # This has been through a number of iterations.  The problem: how to &lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="color:#FF0000;"&gt;    # locate pywintypesXX.dll when it may be in a number of places, and how&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="color:#FF0000;"&gt;    # to avoid ever loading it twice.  This problem is compounded by the&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="color:#FF0000;"&gt;    # fact that the "right" way to do this requires win32api, but this&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="color:#FF0000;"&gt;    # itself requires pywintypesXX.&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="color:#FF0000;"&gt;    # And the killer problem is that someone may have done 'import win32api'&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="color:#FF0000;"&gt;    # before this code is called.  In that case Windows will have already&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="color:#FF0000;"&gt;    # loaded pywintypesXX as part of loading win32api - but by the time&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="color:#FF0000;"&gt;    # we get here, we may locate a different one.  
This appears to work, but&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="color:#FF0000;"&gt;    # then starts raising bizarre TypeErrors complaining that something&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="color:#FF0000;"&gt;    # is not a pywintypes type when it clearly is!&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;My hat is off to those that had the patience to deal with this BS.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The DLLs in question are:&lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;pythoncom25.dll&lt;/li&gt;&lt;li&gt;pywintypes25.dll&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;&lt;div&gt;Anyway - I added a few print statements to pywintypes.py and found that it was trying to load the dll from my recent installation of TortoiseHg. Those DLLs were in the TortoiseHg folder, but they were version 2.5.212.0. The version of those files in my python installation was 2.5.210.0. I checked the win32 download site and they are at version 2.5.214.0.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I was hoping this problem would be solved in the new version, so I installed it. That updated the DLLs in C:\WINDOWS\system32 and C:\Python25\Lib\site-packages\pywin32_system32 but the problem remained.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I tried deleting all the python related DLLs in TortoiseHg. 
This created a new error:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="color:#FF0000;"&gt;Traceback (most recent call last):&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="color:#FF0000;"&gt;  File "boot_com_servers.py", line 21, in &amp;lt;module&amp;gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="color:#FF0000;"&gt;  File "C:\Python25\lib\site-packages\pythoncom.py", line 2, in &amp;lt;module&amp;gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="color:#FF0000;"&gt;    import pywintypes&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="color:#FF0000;"&gt;  File "C:\Python25\lib\site-packages\win32\lib\pywintypes.py", line 124, in &amp;lt;module&amp;gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="color:#FF0000;"&gt;    __import_pywin32_system_module__("pywintypes", globals())&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="color:#FF0000;"&gt;  File "C:\Python25\lib\site-packages\win32\lib\pywintypes.py", line 61, in __import_pywin32_system_module__&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="color:#FF0000;"&gt;    raise ImportError("Module '%s' isn't in frozen sys.path %s" % (modname, sys.path))&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="color:#FF0000;"&gt;ImportError: Module 'pywintypes' isn't in frozen sys.path&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;If TortoiseHg messes with the python DLLs, then it probably causes lots of problems with other python programs. A quick google search shows that TortoiseHg also messes up PyScripter. 
It seems that problem can be eliminated by upgrading TortoiseHg to 0.8 or higher.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="  ;font-family:arial, sans-serif;font-size:13px;"&gt;&lt;/span&gt;&lt;/div&gt;&lt;span&gt;&lt;span&gt;&lt;span class="Apple-style-span" style="color:#FF0000;"&gt;"The incompatibility between PyScripter and the TortoiseHg Mercurial addon has finally been fixed with the release of TortoiseHg 0.8, which replaces the Python Windows shell extensions with C++ based ones"&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="color:#FF0000;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span" style="color:#FF0000;"&gt;&lt;span class="Apple-style-span" style="color:#000000;"&gt;I am using TortoiseHg 0.7.6. The current version is 1.1.1. Since I almost never use TortoiseHg, I decided to uninstall it. Uninstalling it made the problem go away. Hopefully by the time I need to reinstall TortoiseHg, these problems will all be worked out.&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="color:#FF0000;"&gt;&lt;span class="Apple-style-span" style="color:#000000;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="color:#FF0000;"&gt;&lt;span class="Apple-style-span" style="color:#000000;"&gt;***Update Nov 11,2010***&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="color:#FF0000;"&gt;&lt;span class="Apple-style-span" style="color:#000000;"&gt;I have started using Hg again. This time it is for my own project. I like it much better than SVN. I reinstalled TortoiseHg 1.1.2. 
The problems described above did not reappear.&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="color:#FF0000;"&gt;&lt;span class="Apple-style-span" style="color:#000000;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4821566683798191899-7685165704804417237?l=chuckin-py.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://chuckin-py.blogspot.com/feeds/7685165704804417237/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://chuckin-py.blogspot.com/2010/07/error-pywintypespy-dll-load-failed.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4821566683798191899/posts/default/7685165704804417237'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4821566683798191899/posts/default/7685165704804417237'/><link rel='alternate' type='text/html' href='http://chuckin-py.blogspot.com/2010/07/error-pywintypespy-dll-load-failed.html' title='Error: pywintypes.py DLL load failed'/><author><name>Chuck</name><uri>http://www.blogger.com/profile/14869760862894848503</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='31' height='32' src='http://2.bp.blogspot.com/_nMTGTxTJQ6g/TJjETcI2nkI/AAAAAAAAAAw/oYkztOspMP0/S220/fb2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4821566683798191899.post-3486331023581993333</id><published>2010-03-24T07:35:00.000-07:00</published><updated>2010-07-28T07:59:51.393-07:00</updated><title type='text'>Hacking PDF with Python</title><content type='html'>I recently did a project that involved extracting table data from some PDF's create by the UN. 
Here is &lt;a href="http://www.unctad.org/sections/dite_fdistat/docs/wid_cp_am_en.pdf"&gt;an example.&lt;/a&gt; My first suggestion to my client was that he just contact the UN and see if he could get the data in some other format such as csv or excel. Much to my surprise, the UN had no interest in helping a college professor do some research that no doubt will benefit humanity. So we were stuck with pdfs.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I had created pdfs with python before, but I had never extracted info from a pdf with python. So I did a quick google search and found the python package &lt;a href="http://www.unixuser.org/~euske/python/pdfminer/index.html"&gt;pdfminer&lt;/a&gt;  (when all went to hell later, subsequent searches showed that it is the only game in town for programs based in python). pdfminer outputs the results as text, html or xml.  Great - certainly one of those formats would do the trick. So I told my client and we agreed that I should give it a shot.&lt;div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;My first approach was to convert to html. I envisioned tags such as table, tr, td, etc. Then I encountered the horror of pdf. As you may recall, pdf has been around for a long time, since long before the notion of semantic markup. As the name implies, the goal of "portable document format" is to make documents look EXACTLY the same on all platforms. Hence, pdf is more of a vector graphics format than an information format. For example, if you have the letter "a" somewhere in a word that is part of a sentence, pdf will specify the font, font-size and coordinates of the letter, but it might not know that the letter is part of a word which is part of a sentence. If you are a graphic designer, that is great, because it allows you to do all sorts of fine manipulations, such as kerning. But if you are trying to extract info from the document, it is a major headache. Also, pdf has no concept of a table. 
To pdf, a table is just a bunch of lines and letters.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So what does pdfminer give you when you request the output as html? A bunch of &amp;lt;span&amp;gt; tags that attempt to define the position of the text. I cannot remember the details because I switched to XML almost immediately.  XML was not much better, but the structure was slightly easier to process.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;At this point, I contacted the client, told him that things were much more complicated than I had first thought and asked him how he wanted to proceed. He suggested we try a commercially available program that some of his colleagues had had some luck with. This seemed like a good idea to me. Thousands of people before me must have faced this problem. There must be hundreds of millions of pdfs on this planet by now. Why re-invent the wheel?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The program we chose was &lt;a href="http://www.nuance.com/imaging/products/pdfconverter."&gt;PDF Converter 6 by Nuance&lt;/a&gt;. It was about $50. It had a nice batch mode, so it was easy to convert all 125 pdf files into excel. Even better, it had a mode where it recognizes tables and creates an excel file where each sheet is a table. All the other clutter is filtered out! The biggest problem was that the excel sheets did not have the table titles. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;At this point, the solution I attempted was to convert all the pdfs using PDF Converter, then convert the pdfs again to text files using pdfminer. I would use regex to extract all the table titles and then apply them to the excel sheets.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This did not work for two reasons. First, the text in the file created by pdfminer was not in any particular order. To make matters worse, words and sentences were split up. 
It was a mess, with no info to put things back together again.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Second, PDF Converter is not very good at recognizing tables. Worse, it did not fail consistently on any one table type, so it was impossible to know which tables it had missed. Thus, for a pdf, I might get 8 sheets in the excel file but find 12 table titles. So it was not easy to figure out which title went with which sheet.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;...urgh...months have passed...&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I have not kept up with the gory details in this post. Sorry. I wish I had, but it got too depressing. This project really ate my lunch. My client was very understanding. I probably could have bailed. But my (foolish) pride would not let me.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;What I remember is that there were two sources of grief. First, there was the pdf conversion software. Second, the files I was trying to process did not have consistent formatting. Also, the files often used spaces as thousands separators in numbers: one million was written as 1 000 000. This seriously confused the pdf converter.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In the end, I used a combination of python code and manual intervention. 
In retrospect, hiring a typist would have taken less time and been cheaper.&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4821566683798191899-3486331023581993333?l=chuckin-py.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://chuckin-py.blogspot.com/feeds/3486331023581993333/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://chuckin-py.blogspot.com/2010/03/hacking-pdf-with-python.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4821566683798191899/posts/default/3486331023581993333'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4821566683798191899/posts/default/3486331023581993333'/><link rel='alternate' type='text/html' href='http://chuckin-py.blogspot.com/2010/03/hacking-pdf-with-python.html' title='Hacking PDF with Python'/><author><name>Chuck</name><uri>http://www.blogger.com/profile/14869760862894848503</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='31' height='32' src='http://2.bp.blogspot.com/_nMTGTxTJQ6g/TJjETcI2nkI/AAAAAAAAAAw/oYkztOspMP0/S220/fb2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4821566683798191899.post-8621564682312938268</id><published>2010-03-23T12:41:00.000-07:00</published><updated>2010-03-23T12:52:03.379-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='django'/><category scheme='http://www.blogger.com/atom/ns#' term='eclipse'/><category scheme='http://www.blogger.com/atom/ns#' term='debugging'/><title type='text'>More setting up Eclipse for Django debugging</title><content type='html'>I have been using eclipse with break points to debug a django app for a few days now. 
For a while the process was pretty flaky. &lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;At first, I thought it was because I installed Aptana Web Development Tools after I had things working. The reason I installed that plugin was to get decent editors for HTML, CSS and javascript.  After that install, I get the message: "Aptana JavaScript Scripting Console Started" whenever I start eclipse. It seems that message is benign.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I was able to get the debugger to work reliably with django by:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;1) starting the remote debugger (as before)&lt;/div&gt;&lt;div&gt;2) using the Run icon to start manage.py instead of the Debug icon.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4821566683798191899-8621564682312938268?l=chuckin-py.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://chuckin-py.blogspot.com/feeds/8621564682312938268/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://chuckin-py.blogspot.com/2010/03/more-setting-up-eclipse-for-django.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4821566683798191899/posts/default/8621564682312938268'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4821566683798191899/posts/default/8621564682312938268'/><link rel='alternate' type='text/html' href='http://chuckin-py.blogspot.com/2010/03/more-setting-up-eclipse-for-django.html' title='More setting up Eclipse for Django debugging'/><author><name>Chuck</name><uri>http://www.blogger.com/profile/14869760862894848503</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='31' height='32' 
src='http://2.bp.blogspot.com/_nMTGTxTJQ6g/TJjETcI2nkI/AAAAAAAAAAw/oYkztOspMP0/S220/fb2.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4821566683798191899.post-6837436788977496609</id><published>2010-03-16T13:06:00.000-07:00</published><updated>2010-03-16T15:28:30.910-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='django'/><category scheme='http://www.blogger.com/atom/ns#' term='eclipse'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='debugging'/><title type='text'>Setting up Eclipse for Python and Django</title><content type='html'>&lt;div style="text-align: left;"&gt;After spending more hours than I would like to admit, I finally have Eclipse installed, with the Pydev plugin and I have debugging working for django. It was that last part that caused all the misery.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;These notes are small additions to the &lt;a href="http://blog.vlku.com/index.php/2009/06/10/djangoeclipse-with-code-complete-screencast/"&gt;excellent online tutorial by Nick Vlku&lt;/a&gt;. You will want to watch that first. I strongly recommend watching the whole thing once without pause. If you try to do things as he does them you will encounter errors - ones that he will give the solution to a little further into the screencast. &lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here is my setup:&lt;/div&gt;&lt;div&gt;Windows XP&lt;/div&gt;&lt;div&gt;Django 1.1.1&lt;/div&gt;&lt;div&gt;Python 2.5&lt;/div&gt;&lt;div&gt;Eclipse 3.5.2&lt;/div&gt;&lt;div&gt;PyDev 1.5.5&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This version of Pydev includes the extensions. Although the menus might be in different locations, it's not hard to guess what the new locations are. Enter the same values, etc... as Nick does. 
&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;After I got eclipse and pydev installed, I played around with it for a while with an ordinary python app (i.e., non-django). I got used to the auto complete and debugging. Being familiar with these things definitely made it easier to set up debugging for django.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here are some tips for setting up eclipse to work with the django development server. The goal is to have the development server work the same way it does outside of eclipse, but with breakpoints.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The first tip is to watch the Windows process monitor for python.exe processes. It is very easy to end up with some orphaned processes, which can cause things to stop working right. When in doubt, restart eclipse. After you start eclipse, you should see one python.exe process. Starting the remote debug server does not instantly start another python process. Running your code in debug mode adds two processes.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Next, Nick gives some &lt;a href="http://www.djangosnippets.org/snippets/1561/"&gt;code for your manage.py module&lt;/a&gt;. I could not get things to work using that code. I do not know if the problem was the code or something else. I eventually got things working using &lt;a href="http://rajasaur.blogspot.com/2009/12/eclipse-pydev-for-django-with.html"&gt;this code&lt;/a&gt;.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Next, make sure to start the Debug Server before you run your django code. When you run your django code in debug mode, you will see a message "pydev debugger: starting" - this is not the Debug Server. 
If you do everything right, you should see three python.exe processes.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here is what my debug window looks like when everything is working and the code is stopped at a breakpoint at the function homepage in the views.py module:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_nMTGTxTJQ6g/S5__1cXCrOI/AAAAAAAAAAU/3FvSQbt3caU/s1600-h/eclipse1.jpg"&gt;&lt;img src="http://1.bp.blogspot.com/_nMTGTxTJQ6g/S5__1cXCrOI/AAAAAAAAAAU/3FvSQbt3caU/s400/eclipse1.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5449355367752576226" style="cursor: pointer; width: 263px; height: 400px; " /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;I have not figured out how to stop the debug server, other than killing it from the process monitor. If you read the comments on Nick's post, you will see that others are wondering the same thing.&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;As for other setup tips, another blogger recommended setting the environment variables DJANGO_SETTINGS_MODULE and PYTHONPATH in the Python Run configuration. I found that made things worse. If you do not know what I am talking about, then don't worry about it.&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;Others have suggested adding the --noreload flag after the "runserver" command in the Run Python configuration. Evidently, setting that flag causes eclipse to print more informative messages in the console (like the django development server run outside of eclipse), with the downside being that you must stop and start the server if you change any of your code. 
For me the most important thing to know was that django debugging can work with or without that flag.&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4821566683798191899-6837436788977496609?l=chuckin-py.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://chuckin-py.blogspot.com/feeds/6837436788977496609/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://chuckin-py.blogspot.com/2010/03/test.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4821566683798191899/posts/default/6837436788977496609'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4821566683798191899/posts/default/6837436788977496609'/><link rel='alternate' type='text/html' href='http://chuckin-py.blogspot.com/2010/03/test.html' title='Setting up Eclipse for Python and Django'/><author><name>Chuck</name><uri>http://www.blogger.com/profile/14869760862894848503</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='31' height='32' src='http://2.bp.blogspot.com/_nMTGTxTJQ6g/TJjETcI2nkI/AAAAAAAAAAw/oYkztOspMP0/S220/fb2.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_nMTGTxTJQ6g/S5__1cXCrOI/AAAAAAAAAAU/3FvSQbt3caU/s72-c/eclipse1.jpg' height='72' width='72'/><thr:total>0</thr:total></entry></feed>
