Saturday, March 9, 2013

Problems with Django Testing and Selenium

As mentioned in previous posts, I am thrilled with the Django testing infrastructure. Although TestCases are a great start, I have a situation where I really would like to automatically push some buttons on my page. This seemed like a job for Selenium. The Django docs give a nice example, so I thought I would give it a try. Things did not go as expected.

This was related to the way Django (version 1.4) runs tests and the way the staticfiles app works. When tests are run, DEBUG is set by the test runner to False. No problem when running TestCases because the pages are not actually loaded. No static files are needed.

Setup a test using LiveServerTestCase, and now static files are needed. But when DEBUG=False, the staticfiles app leaves serving static files to you. If you have never run DEBUG=False in your development environment, you probably are not setup to serve the static files. Hence you get a flood of HTTP 500 errors.

The solution is on stackoverflow. Here is my code (IS_DEV is a setting indicating the code is running on the development server):

if settings.IS_DEV and ('test' in sys.argv):
    urlpatterns += patterns('',
                            url(r'^%s/(?P.*)$'%settings.STATIC_URL[1:], 'django.views.static.serve', {
                                'document_root': settings.STATIC_ROOT,
                                }),
                            )

urlpatterns += staticfiles_urlpatterns()

When I first implemented this solution, it appeared to not be working because I still got one HTTP 500 error. This was a little difficult to debug because instead of displaying a debug error page, it just generated a server error (because DEBUG=False). I was able to track down the source of that error by turning on the logger. This showed the error was due to a missing favicon.ico file.

Monday, March 4, 2013

Django Testing Examples

Django's testing functionality is awesome. But the docs are somewhat lacking. Here's a great post on stackoverflow that will get you started. Here are a few more examples:

Fixtures

These are databases that are used only for testing. They have known, fixed content that the tests can test against. The easiest way to make a fixture is to run:

python manage.py dumpdata

The problem with that approach is it will include stuff that will break the fixture. What you need are some excludes. And to make things more readable, you might want to add some indents. Here is the command I used:

python manage.py dumpdata --indent 4 -e contenttypes -e auth.Permission -e sessions -e admin.logentry > simple_auth_fixture.json

Note, this will not remove those tables from your testing database. It will just not provide initial data for those tables.

Also, I use south for database migrations. South makes database entries and .py files to track migrations. By not excluding south there was the potential for a conflict with my migration .py files. This did not happen in the tests I did.


Testing an Object Edit View

The usual flow is for the user to load the form using initial data, then edit the data and post it. Here is one way to do that:

response = self.client.get('/app/edit_obj/3/', follow=True)
# Make sure the form loads
self.assertTemplateUsed(response, 'app/edit_obj.html')
# Get the initial form data 
form_data = response.context['form'].initial 

# Now you can change some data and post it
form_data['afield']='xyz'
response = self.client.post('/app/edit_obj/3/', form_data, follow=True)
# Use asserts to handle whatever responses you expect.

Figuring Out Unexpected Errors

Lets say you are testing a view that includes a form. You post some data and get an unexpected error. How do you figure out what the error is? Here's one way:

response = self.client.post('/app/edit_obj/3/', form_data, follow=True)

# These print statements should help you figure out what went wrong 
print 'Errors:', response.context['form'].errors
print 'Non Field Errors:',response.context['form'].non_field_errors
print 'Is Valid',response.context['form'].is_valid()

A Confusing Error from assertFormError

Here is a somewhat confusing error I got when using assertFormError. The error was:

Failure
Traceback (most recent call last):
  File "/home/junk/.virtualenvs/qdb_django14/lib/python2.7/site-packages/my_proj/an_app/tests.py", line 52, in test_call_view_fails_dup
    self.assertFormError(response, 'form', 'email', u"Email address already exists.")
  File "/home/junk/.virtualenvs/qdb_django14/lib/python2.7/site-packages/django/test/testcases.py", line 718, in assertFormError
    " response" % form)
AssertionError: The form 'form' was not used to render the response

The problem was the data I was sending to the form was supposed to be invalid. But there was an error in my form validation code. The method form.is_valid() returned true which lead to an unexpected redirect. The new page did not use the context variable 'form'. It would have been really confusing if the new page also had a context variable named 'form'.

One way to prevent this is to use assertTemplateUsed to make sure you have the correct template before you look for errors.

Testing by User Type

Often it is important to restrict certain activities to certain types of users. Since I made my fixture from a working website, it already contains the various types of users I might want to run a test on. The problem with this is I did not want to put passwords in my tests.

One way to handle that is just to reset each user's password before login. I made a method for that:

class TestEditClient(TestCase):
    def login(self,username):
        # Keep real password out of tests. Sets password to test.
        self.password='test'
        self.user = User.objects.get(username__exact=username)
        self.user.set_password(self.password)
        self.user.save()        
        response = self.client.login(username=username, password=self.password)
        if not response:
            raise RuntimeError('Could not login') 

An alternative would have been to create each user type using the standard commands for creating a user. For my project, users had a bunch of other records associated with them. I did not feel like creating those, so I just used the users we already had in the db.

Sunday, March 3, 2013

Speeding Up Django on Webfaction

My Django/mod_wsgi site uses virtualenv and uses the apache directive VirtualHost. Thus the standard install files, such as httpd.conf, provided by Webfaction need modification. 

The problem I encountered was the first page hit to the server would take more than 10 seconds to load. Subsequent pages would load quickly. The Chrome auditing functionality, showed that these long load times mostly involve the browser waiting for the server after the initial http request.

Others have seen this phenomena. For me, the most useful part of this forum post was introduction to the ab tool. This tool helped me improve my httpd.conf file.

Some strangeness. The ab function continues to work and have good performance even after the server is stopped!

Chrome strangeness - I am using Version 24.0.1312.56 Ubuntu 12.04 (24.0.1312.56-0ubuntu0.12.04.1) on Linux and Version 25.0.1364.152 on Mac OS 10.7.5  I am using the developer tools on both. The network tab shows a timeline of the loading of each component of the page. I expect a bunch of my static files to be retrieved from the browser cache. Both browsers show an http status code of 200 for items that should be retrieved from cache. On the Mac the load times of those files is 0, and the page loads fast. On Linux, the page loads much slower and the load times for those components are on the order of seconds. Firefox on Ubuntu loads extremely fast and indicates it is getting those files from cache.


Saturday, March 2, 2013

Caching HTTPS Static Files in Django

This post assumes you are familiar with the django staticfiles app and are already serving your static files from a different server that's optimized for serving static files.

Caching static files in the browser is an important technique for speeding up page load times. Part of the challenge in getting this right is that many of the parts are dependent on each other.

For my needs, the webpage needed to be served using HTTPS. To avoid the dreaded "mixed content" warning messsage, this meant that all of the supporting static files would also need to be served using HTTPS.

I serve my static files from a shared server on webfaction. My choices are an nginx server where my options as to set cache expires to the max or not at all. The other choice is an Apache server with htaccess controls.

The first question was, what cache controls are needed to cache HTTPS files? Some authors claim it is necessary to set cache-control:public to get the browser to cache HTTPS. If this were the case, then I would need to go with the Apache server. Evidently, that is an old recommendation. According to stackoverflow, its no longer needed as of 2010. Maybe I can still use nginx.

If I am going to use nginx, then the problem is how to update static content when the browser cache is really long. Luckily some clever Django app developers have this all figured out. They have made it possible to have collectstatic put a hash of the file content in to the filename. Thus every time you change the content, a new file is created which will cause the browser to download it and cache it.

But wait, won't changing the filename mess up my templates? Not to worry, the app the adds the hash to the filename, keeps a mapping between the original name and the new name. When a template is  rendered, the app changes the names to the new names. For more info, check out class storage.CachedStaticFilesStorage .

Monday, February 18, 2013

Django Collectstatic Gotchas

Overall the Django staticfiles app is great. But I ran into one behavior I did not expect that caused some problems.

As noted in the documentation:
 The default is to look in all locations defined in STATICFILES_DIRS and in the 'static' directory of apps specified by the INSTALLED_APPS setting
and:
Duplicate file names are by default resolved in a similar way to how template resolution works: the file that is first found in one of the specified locations will be used.
To make my apps more modular, I put a static directory in each. Each containing among other things a css folder. Unfortunately for me, I used the same name for some css files in different apps. Thus the first one found made it into collectedstatic and the other one did not. This was not what I was expecting.

To solve the problem, I set

STATICFILES_FINDERS = ( "django.contrib.staticfiles.finders.FileSystemFinder", )

And I added my apps to STATICFILES_DIRS like this:

STATICFILES_DIRS=[('my_django',os.path.join(MY_DJANGO_PATH,'static')),
('app1',os.path.join(MY_DJANGO_PATH,'app1','static')),
('app2',os.path.join(MY_DJANGO_PATH,'app2','static')))

Now collected static made a folder for each app.


Thursday, January 24, 2013

Problems Stopping Matplotlib FuncAnimation

I am writing a variant of the matplotlib strip chart example. I am running matplotlib version 1.1 on Ubuntu 12.04 from Aptana Studio. I stop the animation using the close button on the plot window.

When I do that, I get the following error message:

invalid command name "182058716callit"
    while executing
"182058716callit"
    ("after" script)
Traceback (most recent call last):
  File "/home/xxxxx/Documents/Aptana Studio 3 Workspace/foot_force/plot_boxing.py", line 179, in 
    plt.show()
  File "/usr/lib/pymodules/python2.7/matplotlib/pyplot.py", line 139, in show
    _show(*args, **kw)
  File "/usr/lib/pymodules/python2.7/matplotlib/backend_bases.py", line 109, in __call__
    self.mainloop()
  File "/usr/lib/pymodules/python2.7/matplotlib/backends/backend_tkagg.py", line 69, in mainloop
    Tk.mainloop()
  File "/usr/lib/python2.7/lib-tk/Tkinter.py", line 328, in mainloop
    _default_root.tk.mainloop(n)
AttributeError: 'NoneType' object has no attribute 'tk'

Its not the end of the world. The code still runs as needed. But the error message is ugly, especially if the code is being used to demo something to a customer.


Turns out the solution is simple. FuncAnimation has a keyword call interval, which is the time between calls to your update function in milliseconds. Increase the interval and the problem goes away.

Saturday, January 19, 2013

Using Python Logging Across Multiple Modules

For some reason, I do not use the builtin logging module in Python as often as I think I should. I am not sure why. I have used logging several times before, yet it has not become part of my standard tool kit. My first suspicion is that I am just being lazy and print statements are so easy. But now as I wade back into the waters, I am starting to remember that logging uses some form of magic that I never took the time to get comfortable with.

The magic appears when you use logging across multiple modules. Suppose you have an app called app.py that loads a function from a module as in the following code:

# app.py
from my_module import my_func
my_func()

# my_module.py
import logging
logger = logging.getLogger(__name__)

def my_func():
    logger.error('message from mod 1')
    logger.debug('a debugging message')

If you run app.py you will get the error:

No handlers could be found for logger "my_module"

But add an import in app.py and suddenly the error in my_module.py goes away:

# app.py
from my_module import my_func
import logging

logging.error('A message from app.py')
my_func()

I lied, you also have to have line 5.

I cannot think of another case where an import in one module effects another module. That's why I did not like using logger. I never got comfortable with how it worked across modules. But magic is also powerful. Time to learn it. Before we go any further, you should read the basic tutorial in the Python docs.

The questions we are going to address are:
  • how does logging come to life?
  • how does it stay alive when execution passes to a new module?
The answer is that once the logger is configured, it becomes part of the Python interpreter process that is executing your code. It effectively becomes global.

Part of the confusion also comes from how easy it is to configure logging. As shown in the snippet above, simply importing logging and logging a message auto-magically configures a logger for your entire project. Convenient, but un-nerving if you do not know its happening.

Here is a simple example of a more explicitly configured logger:

# app.py
from my_module import my_func
import logging
logging.basicConfig(filename='example.log',level=logging.DEBUG)

logging.error('A message from app.py')
my_func()

The logger in the previous example wrote to the console. This logger writes to a file and changes the default level. The amazing thing here is that the log messages in mod1.py also show up in this file.

This also explains why the line:

logger = logging.getLogger(__name__)

shows up in so many modules. This line causes the logging messages to be prepended with the module name. Pretty helpful when you have 47 different modules writing to the same log file.

I think that does it for me. With these mysteries cleared up, the rest of the Python docs are pretty straight forward.

But since we have gotten this far, I want to share a debugging trick. Often when I am debugging a module, I run it as __main__ along with some setup code. Here is how to make the logger work with __main__ and when it is called as part of a larger app:

# mod1.py
import logging
logger = logging.getLogger(__name__)

def mod1():
    logger.error('message from mod 1')
    logger.debug('a debugging message')
    
if __name__=='__main__':
    logging.basicConfig(level=logging.DEBUG)
    mod1() 

This sets up a different logger than the one when the module is not __main__. In the app, you may have the level set at ERROR. But when running the module as main, you can have the level be DEBUG. Also, this debugger writes to the console, which is much more convenient than writing to a file. It essentially creates print statements that can be turned on and off.