Slow DNS requests in Linux

So, I have finally nailed it. It turns out my last post on this topic was only a partial solution and in some cases didn’t solve the problem of IPv6 DNS requests at all. For example, in many distributions OpenSSH is compiled to try the AAAA record before the A record by default. The real problem is that the glibc resolver sends the A and AAAA DNS requests at once, and even if it gets a reply to the A query it still waits until the AAAA query times out if there are no IPv6 routers present on the network you’re connected to.

The solution is simple. The resolver in newer versions of glibc (2.10 and later) has an option to send one request at a time. Edit /etc/resolv.conf (or, if you’re using resolvconf, probably together with a DHCP client, edit /etc/resolvconf/resolv.conf.d/tail) and add a line like this:

options single-request

And that’s all, you’re done. (If you’re using resolvconf, reload it with /etc/init.d/resolvconf reload.)
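If you want to make sure the change helped, you can time a lookup that goes through the system resolver; with the option in place it should return almost instantly instead of hanging for several seconds:

time getent hosts example.com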


Streaming YouTube music videos as MP3 with Python

I’ve been talking with my friend about the idea of a standalone YouTube music player for Linux, since we use YouTube to listen to music very often. There is a great wealth of good quality tunes published there, and easy access to them, just a few clicks away, makes YouTube a nice candidate for streaming music. My friend suggested that there should be a plugin for Amarok or any other Linux player which would let you enqueue some tracks from YouTube and play the music without the burden of running a browser.

I said that’s really not a problem since we are Python programmers, and I thought I might try to hack together a proof of concept. After just under an hour, I came up with a quick and dirty hack that allows you to stream music videos from YouTube as MP3.

So what I’ve done is write a really simple and basic script that conforms to the Unix philosophy – one program, one function – in Python of course. The script takes a YouTube video ID as an argument and streams the video back to standard output (stdout). Here’s the script:

— cut —

# -*- coding: utf-8 -*-
import urllib
import sys

# Pretend to be a regular browser; YouTube may refuse requests coming from
# the default urllib user agent.
class AppURLopener(urllib.FancyURLopener):
  version = "Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.13) Gecko/20101230 Mandriva Linux/1.9.2.13-0.2mdv2010.2 (2010.2) Firefox/3.6.13"

if len(sys.argv) < 2:
  print 'Usage: {prog} VIDEO_ID'.format(prog = sys.argv[0])
  sys.exit(1)

video_id = sys.argv[1]

# Fetch the video description data for the given video ID.
opener = AppURLopener()
fp = opener.open('http://www.youtube.com/get_video_info?video_id={vid}'.format(vid = video_id))
data = fp.read()
fp.close()

if data.startswith('status=fail'):
  print 'Error: Video not found!'
  sys.exit(2)

# The description data is URL-encoded twice and '|'-separated; collect every
# chunk that looks like a direct video URL.
vid_list = []
tmp_list = urllib.unquote(urllib.unquote(data)).split('|')
for fmt_chk in tmp_list:
  if len(fmt_chk) == 0:
    continue
  if not fmt_chk.startswith('http://'):
    continue
  vid_list.append(fmt_chk)

# FIXME: Format choice - for now just take the first link found.
link = vid_list[0]

# Stream the chosen video URL to stdout in 1 KiB chunks.
fp = opener.open(link)
data = fp.read(1024)
while data:
  sys.stdout.write(data)
  data = fp.read(1024)
fp.close()

— cut —

This script streams the raw video to stdout. It doesn’t have any error handling and is really simple; if YouTube decides to change the description format it won’t work anymore. Nevertheless, it streams the video to stdout, and under Linux you can take care of that stream with the available tools.

So I used the FFmpeg transcoder to rip out the audio stream on the fly and transcode it to MP3, which you can then save or listen to as you like.

How can you do this? Pretty simply, given you’ve got the necessary decoders installed (mainly FAAC, FAAD and others used in conjunction with libavcodec). If you want to listen to the MP3 stream, here’s how you do it:

python script.py YOUTUBE_VIDEO_ID | ffmpeg -i - -ab 192k -vn -acodec libmp3lame -f mp3 - | mpg123 -

Where YOUTUBE_VIDEO_ID is the identifier of the YouTube video. What the FFmpeg options mean:

  • “-i -” – the input is stdin (standard input)
  • “-ab 192k” – transcode to a 192 kbit/s MP3 stream
  • “-vn” – don’t include any video in the output
  • “-acodec libmp3lame” – use the LAME MP3 encoder for the transcoding process
  • “-f mp3” – the output format should be MP3
  • “-” – at the end, means the output goes to stdout (standard output)

Thanks to pipelining, three programs work together to make all of this possible: first the streaming script I have written, then the transcoder (ffmpeg) producing the MP3 stream, and finally the player (in this case mpg123). If you want to download the music from a YouTube video and save it as an MP3 file, you do:

python script.py YOUTUBE_VIDEO_ID > stream.flv; ffmpeg -i stream.flv -ab 192k -vn -acodec libmp3lame -f mp3 OUTFILE.MP3

Where you have to substitute YOUTUBE_VIDEO_ID with the YouTube video identifier and OUTFILE.MP3 with the name of your new MP3 file.
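If you’d rather pass a full YouTube URL instead of the bare ID, a tiny helper along these lines could extract the v parameter first (this is just a hypothetical convenience, not part of the script above):

— cut —

import urlparse

def video_id_from_url(url):
  # Pull the 'v' query parameter out of a regular YouTube watch URL.
  query = urlparse.urlparse(url).query
  return urlparse.parse_qs(query)['v'][0]

print video_id_from_url('http://www.youtube.com/watch?v=dQw4w9WgXcQ')

— cut —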

That’s all! Hope you find it useful.


Slow connection requests under Linux

Hello again. A few posts back I talked about slow web requests in Firefox running on Linux, namely Mandriva 2010.1. To briefly recap the issue: when making a connection, Firefox stalled for a few seconds while performing the DNS request. The issue turned out to be timeout related, because Firefox tried to resolve the domain name over IPv6 first. When there are no IPv6 routers present in your network, FF would wait until the IPv6 name resolution timed out before sending the IPv4 resolver packets. This was very irritating, but could be resolved with the method I showed in that previous post.

However, after some time I discovered that this issue is not really Firefox’s fault. It goes further than that and can be traced to a possible bug in the system-wide name resolver library. I’m not sure whether this issue occurs in Linux distributions other than Mandriva 2010.1, but it’s possible. It seems that every application that uses the system name resolving library has the same problem with the AAAA DNS record request timing out when there are no IPv6 routers present in your network. This is very irritating, because every program that needs name resolution is painfully slow when trying to make a connection.

There’s a rather simple solution that gets rid of the problem entirely: disable IPv6 support in your system altogether. I know this is not a perfect solution, but it works. This is how you can do it.

Edit /etc/modprobe.conf and put in those lines:


alias net-pf-10 off
alias ipv6 off
install ipv6 /bin/true

And that’s all. After you restart your system, the problems with slow connection requests are gone. Mind you, you’re disabling IPv6 support entirely, so if you ever need it again, you’ll have to comment out the lines you put into /etc/modprobe.conf.
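After the reboot you can check that the IPv6 module is really gone; the following should print nothing:

lsmod | grep ipv6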

I’m not entirely sure whether this is a bug or an architectural design problem of the resolver library in Linux. It’s extremely irritating nevertheless.


SQLAlchemy QueuePool exhaustion problem

Welcome back. Today I wanted to share some thoughts about a problem with SQLAlchemy, used in conjunction with a Pylons-based application, that has been getting on my nerves for some time. However, before I talk about it I want to lay some basic groundwork for further elaboration on this problem.

My company has been developing a pretty advanced Content Management System for almost two years now. This CMS is based on the Pylons framework, which is probably the most flexible stable Python web framework currently available. Of course the Pylons developers aren’t sleeping, and their Pyramid framework, loosely based on the experiences from the Pylons project and ideas from various other micro-frameworks, is probably the next big thing in the Python web development world. Everybody’s jumping around and praising the new Pyramid, and I’ve even heard that some other frameworks are willing to rewrite their cores on top of it. We’ll see what happens in the near future. Personally I’m glad to hear it, but since Pyramid is still under development, I’m going to stick with Pylons for some time, until Pyramid’s foundations stabilize. After all, Pylons is and will be supported by the Pylons developers indefinitely.

So, let me talk about this CMS we’ve been developing. As I’ve said, it’s based on the Pylons web framework. The framework itself was heavily extended and rewritten in some areas to better suit our needs. From the start we’ve been aiming at a modular architecture. We’re using SQLElixir on top of the widely acclaimed SQLAlchemy ORM for our database model and operations. Of course there are many advanced mechanisms and libraries that have been developed and used for this CMS, but I’m not going to talk about them for now. We’ve had a few successful client website deployments using this product.

As you probably know, Pylons is built around the Paster application server, whose use is not recommended for production environments; it’s mostly intended as a development server. This doesn’t matter much, however, because any application server can be used for a Pylons-based application. The application server is the primary component of a Pylons-based deployment. There’s also another component, the database itself, although a Pylons application can be written without any database. Our CMS, however, uses PostgreSQL as its database backend.

The application server itself is the core of the system, but there is also another component required for the proper functioning of the CMS, one that is neither standard nor included in the Pylons framework. We have developed this component in house: a supporting server – the Scheduling Server.

This is a very important part of the system, despite the fact that the application server can run without it. The Scheduling Server is a secondary, independent daemon process used in conjunction with some advanced administrative panel capabilities of the primary application. Its main purpose is to offload resource-intensive and/or time-consuming tasks that can be performed with some delay and don’t require immediate user interaction. It can also be used for periodic tasks and checks on the primary application server itself.

So, the Scheduling Server is basically a Python-based cron daemon for our CMS and, at the same time, the biggest source of problems for us. Despite its rather simple architecture it has seen numerous rewrites over the last fifteen months or so. Why? Let me explain a little how this supporting daemon works and what it does for the system.

The scheduling daemon is a relatively simple server with a primary loop that periodically checks for new tasks (jobs), at intervals of a few seconds. Each check is launched in a separate thread, so several of them can run at the same time. It mostly interacts with the database through the database model developed for the primary application. The model is shared between the two, so you could say it’s being reused within the Scheduler. Before the scheduling server enters its primary loop, it loads the Pylons configuration and environment by means of pylons.config and pylons.load_environment. These functions use the same .ini file that’s used to configure the application server before it’s started, so the Scheduler basically reuses the configuration of the primary application as well. After configuration, the database model is initialized, the SQLElixir/SA engine is created, the connection to the database is established, and you can use the model the same way you use it within the primary app. After all that, the primary loop starts.
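To give a picture of what that bootstrap looks like, here is only a rough sketch of the idea – the module names and paths are hypothetical, this is not our actual code, and it assumes a standard Pylons 1.x project layout:

— cut —

import time
import threading

from paste.deploy import appconfig
from sqlalchemy import engine_from_config

# Hypothetical project modules - the real names depend on the application.
from cms.config.environment import load_environment
from cms import model

# Reuse the application server's .ini file for configuration.
conf = appconfig('config:/path/to/production.ini')
load_environment(conf.global_conf, conf.local_conf)

# Initialize the shared database model, as in the standard Pylons template.
engine = engine_from_config(conf, 'sqlalchemy.')
model.init_model(engine)

def check_for_tasks():
  # Query the Tasks table through the shared model and run whatever is due.
  for task in model.Task.query.all():  # hypothetical Elixir-style entity
    task.run()

# The primary loop: spawn a checking thread every few seconds.
while True:
  threading.Thread(target=check_for_tasks).start()
  time.sleep(5)

— cut —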

The Scheduler is used mostly for checking for new translations using the i18n facilities within Pylons, making and restoring database backup dumps, sending email, queuing tasks and synchronizing external services, among other things. Some of those tasks, like compiling new i18n translation files or restoring database dumps, require the Scheduler to restart the primary server and make sure it is working after the restart. During most of these operations the Scheduler interacts with the Tasks table in the database.

And here comes the problem. Right after launch, the Scheduler performs its tasks without any problems. However, the longer it ran, the fewer tasks were performed, up to the point where it seemed it wasn’t doing anything at all, even though the primary loop was still functioning. As I said, checks are performed in separate threads, so it’s quite difficult to debug what’s going on, despite having extensive logging facilities in place.

Most of the time there were no tracebacks and no STDERR or STDOUT information in the log files, and since the server is daemonic we couldn’t debug it easily. The problem was really elusive for a long time, because the Scheduling Server seemed to work fine after a restart and we couldn’t find any errors whatsoever. We implemented an interactive mode just for debugging, without any STDERR or STDOUT redirects, so we would immediately see what’s going on. We reimplemented the server more than six times, three times using in-house solutions and three times using available scheduling libraries, without any success. The problem would disappear for a while, but once clients started using either the translation or the backup facilities it would come back after some time. We couldn’t trace the origins of the problem and we didn’t have any clue as to what might cause such odd behavior.

A few times I tried commenting out portions of the code and running the tasks without any threads whatsoever, sequentially, one directly after another in the primary loop, because I suspected there was a strange bug in the task code itself. Even when running and testing this way for a few hours, no problem appeared at all. Not a single error. No traceback. Not even a warning. Everything was working as intended. I was clueless. Until today.

Once again I wanted to attack this problem with a fresh perspective and an open mind. Up until today it was merely irritating, because we had worked around it with a very nasty hack that restarted the Scheduler every so often. I didn’t want to leave it that way, so I spent the whole day testing, looking for the bug, testing some more, rewriting, trying, getting nervous, breaking the code to pieces, analyzing, testing, uploading to the server, retesting, searching somewhere else, spitting curses, getting mad, debugging and doing everything again ad infinitum.

After digging into it for a while, I came to think that the problem lay within the threads, or at least in how the task code behaved when wrapped in threads. This enlightening thought came to me after I had rewritten most of the Scheduler core to use the APScheduler library and gotten rid of the daemonizing code, leaving purely interactive code I could debug more easily. I also rewrote the logging facilities, which finally gave me some meaningful output. Running the Scheduling Server in this stripped-down debug mode, I could see that after performing some operations in the primary application’s administration panel, the task threads began to throw tracebacks.

The traceback stated: “TimeoutError: QueuePool limit of size 5 overflow 10 reached, connection timed out, timeout 30”.

Quickly googling for an answer brought only a slight clue: the connection pool queue was probably exhausted. The reason was still unknown. So I ran two of the previous implementations of the scheduling server to see whether the problem persisted there or was simply some implementation-specific bug. It turned out the problem appeared in the other implementations as well. I tried fiddling with the sqlalchemy.pool_timeout parameter, but to no avail. After spending a few hours trying to resolve the issue I thought that explicitly closing or removing model.session would help. Still no luck.
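For context, the numbers in that traceback are simply SQLAlchemy’s default pool settings; something along these lines (a hedged illustration, not our actual engine setup) is what they correspond to:

— cut —

from sqlalchemy import create_engine

# pool_size=5, max_overflow=10 and pool_timeout=30 are SQLAlchemy's defaults,
# which is exactly what the "size 5 overflow 10 ... timeout 30" message refers to.
engine = create_engine('postgresql://user:password@localhost/cms',  # placeholder URL
                       pool_size=5, max_overflow=10, pool_timeout=30)

— cut —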

So I used the poor man’s debugger, namely “print variable” and “print dir(variable)” statements, to see if I could learn something. There was still no clue whatsoever as to why the bug might happen. I turned to reviewing the system’s model initialization functions and tried to figure out where the connection pool might reside, checking the function descriptions in the SQLAlchemy documentation. I noticed that when I closed the session inside the thread, nothing happened, but when I got rid of the database engine variables, the next task run by the Scheduler threw exceptions about not being able to use the database connection. Step by step I came to think that it must be something between the model’s engine and session instances. I knew it was a connection problem, but since I was using SQLElixir there was no simple way to get into connection pool debugging, or at least I didn’t know how to do it easily.

I thought I’d give the model.engine functions a try and see if I could find anything relevant to closing connections. After looking at the documentation, one function immediately grabbed my attention: model.engine.dispose(). I reasoned that if the problem was related to threads, the connection pool must somehow be left open after a thread has finished. Or maybe the task finishes but the thread keeps running, because garbage collection is not performed properly by Python and the SQLAlchemy connection QueuePool is left somewhere in memory without cleanup.

After adding model.engine.dispose() after each executed threaded task and testing extensively, the problem went away. I immediately understood what was causing the bug. It was related to… imports.

Each task thread had its own module with a class for that thread, and each of those modules had a “from cms import model” statement. We have successfully used this statement within the primary application, but since the primary application is built on Pylons, which is threaded by default, the problem didn’t occur there: the model was shared by Pylons among the threads and reused from memory (I know this is an oversimplification). However, in our supporting Scheduling Server, when the scheduler launched a new task it probably didn’t clean up the model after thread termination (I’m not sure the thread was terminated at all), and the model.engine with its connection pool was still active and residing in memory. Since each new thread brought up a new model instance instead of reusing the previous one (it was using a module, not a Singleton-based class), and the previous instances were never cleaned up, the connection pool was exhausted quickly. That’s why forcing engine shutdown by means of model.engine.dispose() cleaned up the stale connections within the thread and allowed it to finish. This is just a guess, because I didn’t have the time or the will to debug the cause further, since I’ve already wasted too much time on it. My bad, I know. ;)
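In practice the workaround boils down to wrapping each threaded task roughly like this (a sketch under the assumptions above; run_task and the task object’s run() method are hypothetical names):

— cut —

from cms import model  # the shared model module mentioned above

def run_task(task):
  try:
    task.run()
  finally:
    # Drop this thread's session and force SQLAlchemy to close every pooled
    # connection; otherwise each task leaks its own pool and the QueuePool
    # limit is reached after a handful of runs.
    model.session.close()
    model.engine.dispose()

— cut —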

Anyway, this note may not be very useful to you, but at least it shows that sometimes really absurd things can wreak havoc and be the cause of many hours of debugging and cursing at everything around you. Some quite simple and (mostly) logical things may be your nemesis when programming. I’m still not fully convinced by the explanation I gave here, but it sounds probable. It may also be that SQLAlchemy or Python itself has some nasty bug that causes this problem in a specific situation. As I mentioned, I’m not willing to investigate it further for now. Debugging multi-threaded programs is a pain in the ass, and in Python that’s especially true.

That’s all for now. Thanks for reading.


Slow mail sending through Postfix server problem and solution

Hello. Today I wanted to tell you about a problem I recently encountered with a Postfix mail server when sending mail through it. I have a mail server running in a virtual machine on my company’s quite powerful server. It had always worked great, but recently I stumbled upon a problem: mails with attachments sent from my account were transferring really slowly. All other services, including mail received over IMAP, were running fine and at nominal transfer speeds.

At first I didn’t notice the problem existed, because I wasn’t sending any mails with big attachments at all. Today, however, when I wanted to send an attachment weighing a few megabytes, I saw that its transfer to the Postfix SMTP server was really slow. I decided to investigate the cause.

So the first thing I did was launch the best network diagnosis tool: Wireshark, of course. I noticed that the Postfix mail server was replying with “TCP Zero Window” packets. After seeing those packets occur repeatedly over the course of the mail transfer, I immediately understood that there were three possible causes:

  • The application or server buffers are full and the TCP/IP stack is signaling that it cannot receive any more data for now, so it sends “TCP Zero Window” to make the client hold off until the buffers are emptied.
  • The server is out of network sockets, which would mean the mail server is under heavy load, suggesting a probable Denial of Service attack.
  • Or the application – in this case the Postfix mail server – is running very slowly for some reason.

After logging into the mail server I checked “htop” for signs of overload. The situation seemed normal, with the server at 2% CPU usage and less than 1% memory usage. Since the physical load on the server was negligible, I assumed this was probably not the cause. I still hadn’t ruled out I/O load on the server, but I left that for later. Checking the current load suggested that the buffers were probably not full, though I couldn’t dismiss that possibility yet.

The next step was to make sure the mail server wasn’t under a DoS attack. A quick check with “netstat” confirmed that there were far too few connections to the SMTP service for the network socket pool to be exhausted, especially considering that the IMAP service was working nominally.

So the last thing to check was the Postfix application itself. Since “htop” didn’t show it as overloaded, I began checking whether the mail queues were congested. I issued the “mailq” and then “qshape.pl” commands, which told me the mail queues were almost empty, so they were not the cause of the problem either.

There was one last thing to check: the I/O load generated by Postfix. I ran the “iotop” command and observed it for a while. It seemed that one of the Postfix child processes was generating an unusually high amount of disk writes. So I issued “tail -f /var/log/mail.info” to watch for any errors, and immediately found the cause of the problem: Postfix was writing huge amounts of debugging data to the mail log.

Now I knew I must have messed up the configuration in /etc/postfix/main.cf. Quickly browsing through the configuration file showed that I had set “debug_peer_level = 3” and that the “debug_peer_list” option had been set to my home computer’s IP address. Setting debug_peer_level back to 2, commenting out “debug_peer_list” and restarting Postfix resolved the problem.
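For reference, the offending part of main.cf looked something like this (the address here is just a placeholder for my home IP):

debug_peer_level = 3
debug_peer_list = 192.0.2.10

and after the fix:

debug_peer_level = 2
#debug_peer_list = 192.0.2.10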

It seems I had forgotten to remove the debugging options from the Postfix configuration after debugging some other issue with the mail server earlier. :)

Anyway, I hope this little tutorial helps someone some day. ’Till next time, folks!


Avoiding tracking and profiling on the Web

Today I wanted to share something about a topic that is really important for people concerned about privacy, like me. As many of you certainly know, the Internet is crawling with advertisements. Advertising companies use many advanced methods to identify and track users across websites, creating detailed dossiers about every single person using the Web. The leader in gathering sensitive private data is Google. Every time you visit a website you leave a footprint that can be used to identify your interests, likes and dislikes, your preferences and opinions, so that advertisers can target you more specifically. This is how they choose what they call “relevant ads” to present to you.

Despite the fact that this can have some positive outcomes for you (and certainly means more money for the advertiser), there are many negative sides to it, the most notable being the invasion of privacy. Nobody can guarantee that the data gathered by advertising companies will stay safe in their data banks. Identity theft from such a data bank can have far more wide-reaching consequences than an ordinary person can imagine. The data collected on you can be used by criminals, your enemies and pranksters to gain significant knowledge about you. Since I don’t have much time to dig deeper into the topic, I’m going to show you some techniques that can be used to avoid tracking on the Web to some extent. This is by no means a complete tutorial on avoiding tracking.

First things first. I’m using Firefox for browsing the web, because this browser has plugins developed especially against things like tracking and profiling your presence on the network. Advertisers use so-called “cookies”, which are unique identifiers saved by your computer each time an advertisement is shown on the website you’re browsing. To be specific: such a cookie is saved by the browser on your computer the first time the browser downloads an advertisement from a given ad company. Whenever your browser requests an advertisement from that ad company again, even when it is on another website, the previously saved cookie is sent back, immediately identifying you. As you surf the web and your browser downloads advertisements from the ad provider, they are also gathering data about which websites you visit, how often, what content you browse and much more. That way, even when they don’t have your sensitive private data, such as your name, address or credit card number, advertisers are silently building an extensive dossier about you. This dossier is assigned a unique identifier and can then be correlated with other dossiers if you are, for example, using other computers, building up even more information about you. And if you happen to buy something from such an advertisement or give out your sensitive private data – boom, they have a complete profile of you.

What can be done about this? The first thing that comes to mind is to disable cookies. Unfortunately, many sites use cookies to store your logins and site preferences, so this is not really a very good option. However, Firefox has a nifty little plugin called AdBlock Plus that will block most advertisements for you. Most of you probably know about it and use it. Not showing ads has many positive effects: your browsing experience is cleaner and faster, you conserve bandwidth, and without advertisements there are no ad cookies that can be used to track your presence on the network. AdBlock does a really good job, but occasionally it will let some ads slip through its filter. AdBlock can be downloaded here.

Getting rid of advertisements is certainly a serious blow to the tracking abilities of advertising companies. However, it is only one side of the tracking problem; there are many far more advanced techniques for gathering data about you, which I won’t get into today. So let’s turn to another side of the tracking and profiling problem: search engines, especially Google. Google and other companies in the search business build a dossier about you whenever you query their search engine. A profile is built from your IP address, cookies, login information and more. Search engine companies say this is mostly for serving you more relevant search results. However, what they are doing, especially Google, which is primarily an advertising company (sic!), is gathering even more data to fill out their profile of you so they can serve you targeted advertisements (sponsored links, AdWords, etc.).

What can be done about it? Don’t use search engines. Impossible, right? Right. However, there are some nifty little plugins that will fool Google. First I’d like to introduce you to TrackMeNot. It works in the background, randomly querying the configured search providers with random phrases. This way Google will think you were searching for things you never did, creating a false dossier about you. You can download TrackMeNot here.

However, this is not the only thing Google uses to profile you. When you use their search engine, every click on a link in the result list is saved in their database. Google gathers data about every so-called “exit link” from their search engine; this way, they can see which web pages you have visited through their search system. There is another plugin that will disable this tracking, called OptimizeGoogle. It changes each link in the search results so it points directly to the website you want to reach, eliminating the redirection techniques Google uses to track your behavior. OptimizeGoogle does much more than that, but I won’t get into it here. OptimizeGoogle can be downloaded here.

There’s also one other technique, besides cookies, that advertisers and others use to track and monitor your activity on the Web. Whenever you load a Flash animation it can save a Flash cookie, called a “Local Shared Object” or LSO for short. This is even more dangerous than ordinary cookies, because it can store much more data about you than a simple identifier. Unfortunately, many people are not aware of the existence of LSOs. Again, there’s a solution to this problem: a Firefox plugin called BetterPrivacy, which will clear LSOs after you’ve closed your browser. It can be downloaded here.

With this set of plugins you should be fairly safe from tracking and profiling on the Web, and they should make your browsing experience more private. Unfortunately, there’s one serious blow to all the tracking avoidance techniques I have shown you: there’s a way you can still be identified with nearly one hundred percent probability. Your browser always sends what is called a “User-Agent” string to each web page it visits. This alone doesn’t give website owners or advertisers enough information to tell that the user currently browsing their page is you. However, each browser has a set of plugins (which are not add-ons like the ones I mentioned earlier) such as the Java plugin, the Flash plugin, the Adobe Reader plugin, etc., and those plugins have specific versions. Taken together, your User-Agent string plus the list of plugins and their versions makes your browser unique with something like 99% probability. Websites can gather the list of plugins and their versions exposed by your browser by means of JavaScript. This way you can again be identified and tracked across websites.

What can be done about it? Well, not much, really. You can resort to changing your “User-Agent” in Firefox’s preferences. This is just obscurity, however, and it still doesn’t prevent a website from trying to gather information about the plugins installed in your browser. You can use, for example, the NoScript add-on for Firefox to disable JavaScript on every site and then re-enable it temporarily or permanently on websites according to your needs. NoScript is a very good add-on which will guard you against pop-ups and bugs in Firefox’s JavaScript engine, block statistics-gathering code and, most of all, prevent websites from gathering your browser’s plugin list. NoScript, however, can be detrimental to the user experience and I wouldn’t recommend it to anyone but advanced users. I use it all the time and I have to say it is the single best add-on for FF apart from AdBlock. You can download NoScript here.

For more advanced users still concerned about their privacy, I’d recommend installing a third-party tool such as a privacy-enforcing proxy server, e.g. Privoxy. It is a content-filtering proxy server installed on your computer, with many advanced capabilities I won’t get into. If you’re interested in giving it a try, download it here. Of course, you would have to configure it according to your needs and then install a Firefox add-on like FoxyProxy, configured to use your Privoxy instance as its primary HTTP proxy server. FoxyProxy can be downloaded here. You can also simply set up the proxy server directly in Firefox’s network preferences.

Using the add-ons and techniques I’ve just presented should give your browsing experience a certain level of privacy. This is by no means a complete list of tracking avoidance techniques, nor was it intended to be. Remember that the best way to ensure your privacy on the network is to use your head. Also remember that privacy is not anonymity. Putting this advice into practice can give you a degree of anonymity, but considering that each computer is identified by, for example, its IP address and many other things, it won’t stop anyone who has the knowledge to identify you. These techniques are merely intended to stop greedy advertising companies from building a dossier about you, not to hide your presence on the Web. Ensuring anonymity is a whole different story that I may write something about in the future.


Python RSVG not found under Mandriva

Just another quick note with a solution to problems with rendering SVG vector images under Python in Mandriva 2010.1. I was testing Python Cairo (PyCairo) and the Python wrapper around the RSVG library, trying to render an SVG image onto a PyGame surface. It seemed that after installing the python-cairo development libraries there was no PyRSVG. I spent a lot of time looking for it until I finally realized that the wrapper ships in the “gnome-python-desktop” package.

So if you want to use Cairo and RSVG for Python in Mandriva 2010.1, simply run as root:

# urpmi gnome-python-desktop

And you’re ready to rock.
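For completeness, a minimal sketch of rendering an SVG to a PNG with these two modules could look roughly like this (the file names are placeholders; on my setup the width and height are the first two values returned by get_dimension_data()):

— cut —

import cairo
import rsvg  # provided by the gnome-python-desktop package

handle = rsvg.Handle(file='image.svg')
width, height = handle.get_dimension_data()[:2]

# Render the SVG onto a Cairo image surface and dump it to a PNG file.
surface = cairo.ImageSurface(cairo.FORMAT_ARGB32, int(width), int(height))
ctx = cairo.Context(surface)
handle.render_cairo(ctx)
surface.write_to_png('image.png')

— cut —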


Slow web requests in Firefox under Linux

For some time now I’ve been wondering why every request to a website in Firefox is so slow: pages opened with a delay of a few seconds. I didn’t have time to check thoroughly why page requests from Firefox under Linux were taking that long to complete, despite the fact that my network connection has rather low round-trip times.

This bothered me for some time and became more and more irritating as time went by. I found out that the problem wasn’t limited to one computer, but affected other machines running Linux and Firefox as well. Every other program and application worked fine. I observed that the problem was not with the download itself, but rather with name resolution. I tested the domain name servers I’d been using, and they responded to the dig command rapidly. So it was a little mystery to me.

Today I decided I would get rid of the problem for good. So I launched my favorite diagnostic tool – the great Wireshark – and went to see what’s going on. I observed that when a request goes out from Firefox to the DNS server, two queries are sent: IN A and IN AAAA, but only one answer comes back (IN A). Then, after five seconds, the IN A request is sent again, and as soon as the answer arrives the download instantly kicks in. So I knew the issue was the unnecessary doubled DNS requests and some timeout within Firefox.

It turns out that Firefox will try to use IPv6 when it’s available. When your Linux box supports IPv6 by default but you’re on a network that doesn’t use it, Firefox will still request IPv6 addresses for each domain. For some reason the DNS servers on my network don’t send SERVFAIL for the AAAA record, causing Firefox to wait until the query times out (a five-second delay) and then repeat the IN A DNS request. I’m not really sure whether it’s a bug in FF or not, and honestly I don’t care. There is, however, a method that will bring your FF back up to speed.

In the Firefox URL bar type “about:config”. When it says something about “it’s dangerous, blah blah blah”, click “I promise I’ll be careful”, then type “network.dns.disableIPv6” into the search bar. Select it, right-click and choose “Toggle”. When the value of the option changes to “true”, you’re done. Restart Firefox and your long DNS waits are gone.
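If you’d rather not click through about:config on every machine, the same preference should also be settable from a user.js file in your Firefox profile directory:

user_pref("network.dns.disableIPv6", true);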


Grown up boy with his toys

I’ve just finished my work for today. For the first time in longer than I can remember, it was a truly productive and inspiring day. I had time for research, time for administrative work and time for fiddling with my hardware. But let’s start from the beginning.

Since I got my Novell certification and landed my first client for setting up Novell services on their servers, I’ve been on a fast track to learn more about those services. I’m now pretty familiar with most of the Open Enterprise Server 2 services. Being a Linux specialist for a long time gave me a head start, but I have to say that Novell services are NOT like other Linux services.

It’s quite easy for me to learn new things in IT, especially when it comes to networks and Linux servers, because the technical details of computer networks, data links, telecommunication technologies, protocols, packets and all that black magic have always given me a thrill. I love multi-layered, complex and interdependent systems; I feel like a fish in the vastness of the ocean when it comes to them. Networked computer and electronic systems are my passion. They are like complicated Lego blocks, partly a mystery and partly a riddle that I have to understand and solve.

And that’s it. In the last month I have learned a lot. There’s a topic I’d like to bring to light: Linux virtual machine hypervisors. As much as I’d like to say that all, or at least some, of them do their job well, I unfortunately have to confess there are problems with each of them. My adventure with hypervisors started with Linux-VServer a few years ago. I can’t say much about it, because when I tried to use it on my Mandriva-based systems there were many kernel problems, so after a few days of trying I gave up. Perhaps VServer is fine these days, but I’m not going to give it another go, especially when there are better virtualisation solutions. After VServer I went looking for good emulation/virtualisation software, and there were a few, better or worse. One worth mentioning is QEMU, because it at least worked, but I finally settled on Sun’s VirtualBox. That was a really polished package back then compared to the other hypervisors, and it ran smoothly on Mandriva on pretty obsolete hardware. So I used it for some time.

VirtualBox is great, but it’s really more of a desktop than a server solution. It’s good for running Windows and some Linux distros, but I couldn’t run Novell SLED/SLES in it. I still use it on my hardware from time to time, but rarely. When it comes to server solutions, I have to say I made one of the biggest mistakes of my career when I selected VMware Server for my primary internet server. This software is so buggy that I wouldn’t recommend ANYONE use it in production, especially on I/O-heavy servers. I’m running a few VMs on my server under this piece of shit and I have to honestly say that choosing it was a failure.

The server is a quad-core Intel Xeon platform with an LSI SAS RAID controller and a good amount of memory and processing power. Unfortunately, the RAID controller is heavily used; there are many fast, small I/O operations, and this is what causes me pain. There’s a combination of three factors that produces disastrous results in the VMs. First, the atrociously written VMware Server. Second, a combination of a S.M.A.R.T. bug and the guest Linux VM kernels in general when running under VMware Server – a bug still unsolved, despite supposedly having been resolved. The third is the problematic LSI SAS RAID controller, which combined with the two previous factors is fatal for the guest OSes. Why? When some of the VMs are at the limits of their I/O-bound throughput and the host kernel receives a spurious IRQ from the controller, VMware Server causes a busy I/O condition inside the virtual machine. The default behaviour of the Linux kernel on a busy I/O error is to lock down the partition and set it to read-only mode. You can only imagine the hell this causes.

Of course, in theory the problem can be avoided by configuring the guest Linux kernel to ignore busy I/O errors, but that may cause data corruption (this fail-safe is there for a reason). If you’re as adventurous as I am (and like making emergency repair runs to the data center at night) you can set this behavior, but it’s still of no use. Why? Because of VMware Server. When the hypervised VM hits busy I/O, the whole VMDK file (containing the guest partitions) is locked down by the hypervisor and set to read-only. Only a reboot of the VM helps then, and honestly: imagine rebooting a critical virtualised server in the middle of the day. Sounds sweet, right? As a bonus, there were a few times when the partitions got mangled and no automatic restart was possible, because you had to manually run fsck and then reboot again. Add to that at least half an hour of partition scanning by fsck in the middle of the day, and you could say Inferno breaks loose at your doorstep.

But if you still think you can handle it like a BOFH masochist, imagine this. If you want to at least pretend you’re security conscious, you run your server management services through encrypted and secure channels like SSH, tunneling the ports for those services. VMware Server has an SSL-encrypted console which lets you manage your hypervisor and VMs, written as a slow web app combined with console plugins for your browser. Who the fuck invented this? But that’s not the end of it. To use the console you have to expose all three management ports, including one from the root-only port range. So if you’re setting up a reverse tunnel from a remote location, you actually HAVE TO expose root login to your management station over SSH; otherwise the reverse tunnel for the root-only port can’t be established. Just imagine what a piece of crap VMware Server is.

If you’ve ever considered using VMware Server, especially on an Intel S5000PAL with an LSI SAS controller: DON’T!

Fed up with this hypervisor, I started doing some research. I did test runs of VMware Workstation and I have to admit that despite its similarities to VMware Server it seems okay. Nevertheless, it’s a desktop solution rather than a server one, and on top of that it’s not free.

So I went on to check out VMware ESXi 4. This is a bare-metal (Type 1) hypervisor, unlike the others mentioned previously, which are OS-hosted hypervisors. I have to say I was positively surprised. It runs smoothly, and installation and configuration are simple. There’s only one thing: your hardware must be on the compatibility list, or else you’re not going anywhere. The second con is that to manage your VMs you have to use something called vSphere Client, which runs only on Windows. For someone like me, who avoids Windows as much as possible, it’s quite unacceptable that I can’t use my Linux desktop or laptop to manage a hypervisor running on what is de facto Linux, which in turn manages Linux guest VMs. Fortunately, there’s VMware Workstation, in which I can install Windows and then run vSphere from there, but it’s an unnecessary and unproductive burden. If I can’t find a better solution I’ll probably migrate all my VM servers to ESXi, but it’s a tedious task. Why? Because ESXi keeps VM data on a special partition called VMFS, which cannot be mounted by anything other than ESXi. Hopefully I don’t have to spell out what that means, especially when you don’t have another spare machine exactly like this one.

So, I didn’t expect this post to be primarily about VM hypervisors, but there’s one mainstream Linux hypervisor I haven’t mentioned yet. Yep, it’s Xen (yes, I know there is also KVM, but honestly, who uses it?). I’m currently in the process of installing a few servers with Novell services for my client, and they decided to go with Xen. I had no previous experience with this system, so I had to get familiar with it quickly. I must say that on SLES 11 it’s quite stable and easy to use. But… (there always has to be one, right?) who the hell writes VM management software in Python? I know, we all love this language, it’s beautiful and I worship it every day, but it’s too slow for things like managing VMs! OK, maybe I’m exaggerating a little here, but honestly, it’s slow. And virt-manager is very buggy. For example, I installed the guest OS and then removed the DVD from the drive. And guess what? The machine would start (after a few tries), but it wouldn’t get into the console, throwing errors about vbd not found, blah blah blah. And guess what: restarting the system won’t help. Editing the XML and VM configuration files won’t do it either. It won’t let you edit the hardware settings for the machine, throwing the same stupid error over and over again. After two days I finally managed to get into the console by hand-editing files in /var/lib/xen. No mention of such things in the documentation. None at all.

But that’s not the end of it. The management console is broken in other places too. For example, after installing the second server and rebooting the whole system, virt-manager simply hangs when you click to start the second VM. I understand it’s not the core software, but hey, it’s supposed to be a production-ready hypervisor (tools included), right? I say it’s NOT. Nevertheless, I have to admit that once the VMs are up and running, Xen is quite stable and amazingly fast in paravirtualized mode. I think that closes the topic.

I thought about writing up some other interesting things I did today as well, but given the current time and the length of this post I’ll only give a quick overview.

So, today I fought with this VMware Server on my hardware again, because I had to prepare some virtual environments for the Pylons application server for the web application I’ve developed. Imagine that 10 GHz of computing power was fully consumed for unknown reasons (probably no DNS connectivity as usual, plus a read-only partition on the logging server), but I didn’t have time to check thoroughly. VMware causes some kind of race condition that eats all the CPU resources. Another reason to finally kill it. Anyway, while my pip run was installing all the necessary dependencies for the app server, I did a little research into the topic of filesystem synchronization.

Recently my friend and I installed iFolder on our development and test Open Enterprise Server, preparing for an iFolder installation on our client’s servers. I must admit it’s quite useful and certainly works well. We’ve also tested the Novell iFolder client on SLES and Windows – very nice file synchronization software indeed. I’m using Mandriva on my laptop and unfortunately the iFolder client is not supported on any distro except SLES/SLED, but I’ve managed to install and run the client successfully under Mandriva. If I find some time I’ll write a little how-to on the topic. It needs some fiddling, but it can be done.

For a long time now I’ve been planning to build myself a useful solution for synchronizing files between my machines. I’ve looked into various open source solutions, but there was always something that didn’t suit me. Because of this I started a project, the first of the building blocks for such a synchronization system, namely an abstract client/server network library codenamed PyComm. Unfortunately I haven’t had time to finish it, but I’m slowly getting back to it as more important things get done. PyComm is a hybrid solution, rather simple in design, but modular, extensible, robust and secure. I have taken a hybrid CPU-/I/O-bound approach where the primary data loop processing is CPU-bound and the connection data loop is I/O-bound. Honestly, I haven’t tested it yet and no profiling has been done, so I don’t have any real data to compare against standard approaches to building server architecture. In theory, threading the I/O operations and processing the primary loop should allow for fine-grained tuning of delays and data queues. In theory it should also prevent, or at least limit, resource exhaustion under heavy load or a DDoS attack by carefully adjusting the CPU-to-I/O ratio. But whether it’s an effective approach remains to be proven.

On the protocol side, PyComm defines a custom binary protocol which can use TCP or UDP (depending on the situation) as its transport. It encourages data encapsulation somewhat similar to SSH channels: the data stream is multiplexed and sent as custom frames, each with its own header, payload and terminator (footer) sequence. This allows the content transfer unit to be adjusted dynamically for each frame and (at least in theory) gives better delivery guarantees on high packet-loss links such as WiFi, where TCP does not (it should, but due to the nature of the radio data-link layer delivery is not really guaranteed – correct me if I’m wrong). There are more things I want to test in PyComm, but they’re beyond the scope of this post.
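Since PyComm isn’t published anywhere yet, here’s only a rough illustration of the kind of framing described above – the field sizes and the footer bytes are made up for the example:

— cut —

import struct

FOOTER = '\xde\xad\xbe\xef'          # made-up terminator sequence
HEADER = struct.Struct('!HI')        # channel id (2 bytes) + payload length (4 bytes)

def pack_frame(channel, payload):
  # Header, then payload, then the footer marking the end of the frame.
  return HEADER.pack(channel, len(payload)) + payload + FOOTER

def unpack_frame(frame):
  channel, length = HEADER.unpack(frame[:HEADER.size])
  payload = frame[HEADER.size:HEADER.size + length]
  assert frame[HEADER.size + length:] == FOOTER
  return channel, payload

print unpack_frame(pack_frame(1, 'hello'))

— cut —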

So, PyComm is a building block for the file synchronization solution I’m planning to tackle as soon as time allows. I still haven’t decided on the architectural approach and design for it; I’m currently researching the topic. For Linux/Unix/Mac OS X one of the underlying workhorses would be FUSE. Windows is a different story. Because of all the things I mentioned earlier, I started looking for information about Windows filesystem drivers. I’m not very familiar with the low-level guts of Redmond’s hell. The IFS kit seems like hell itself and is very costly, so naturally it goes in the trash from the start. But I stumbled upon a nifty little project that follows in the footsteps of FUSE. It’s named Dokan and it aims to be FUSE for Windows. I have tested it superficially and I must say it looks promising, except for the fact that I’d have to write a Python wrapper around the library myself, because it’s written in C. Hopefully PyComm will run on Windows without major architectural changes.

The goal of this synchronization solution is to provide a fine-grained security and replication scheme across multiple machines on multiple platforms and architectures (Linux/Windows for a start) without the need for domain controllers on the network. I’m not ruling out some LDAP connectivity in the future, but for now the solution must achieve three goals: be robust, be secure, be easy to use. After all, it’s just a proof of concept and a great testbed for PyComm. Why reinvent the wheel, when we have so many synchronization solutions out there – SSH, NFS and rsync come to mind? NFS has its problems; everybody knows that. It’s not truly intended for synchronization, it’s not very cross-platform and not very secure. SSH is too slow most of the time and not meant for synchronization. Rsync is not secure (of course it can be when tunneled through SSH, which is slow). iFolder, on the other hand, is not free, doesn’t allow fine-grained security/replication schemes, is not supported on many Linux distributions, needs eDirectory to run, and is written in Mono, which I personally reject on the grounds of “too much overhead” – Java, anyone?

Anyway, when I was looking for information about Windows filesystem drivers I came across another interesting thing: MANETs – Mobile Ad-hoc Networks. It has always amazed me why the hell mobile devices like smartphones, laptops, netbooks, tablets and all that gear loaded with communication protocols like WiFi or Bluetooth don’t act as IP carriers/routers among themselves. Use of the TCP/IP stack should be obligatory when interconnecting those devices; instead they use some stupid, inflexible and unnecessary protocols. The natural way forward is decentralized, self-organizing mobile ad-hoc mesh networks. I envision the marginalization of traditional network connections within the next twenty years and a shift to a global decentralized mesh network. It’s obvious, it’s economical, and it broadens the coverage of IP connectivity among people, across the globe and perhaps into space. The 802.11s standard is currently being drafted, so there’s much room for improvement, and the seeding phase of the global mesh network is going to be a very interesting one indeed.

The advent of a global decentralized mesh network is a major blow to greedy governments and corporations wanting to control information, which is liquid – it must flow (like the spice ;) ). An ad-hoc mesh network is uncontrollable, it doesn’t lend itself to censorship (or at least poses a major technological challenge for such a task), it follows a minimum-energy-path pattern and it is economically viable. It also furthers IP connectivity and redundancy, so most of the time you’d stay connected wherever you are.

On the other hand, a global dynamic mesh network is a serious invasion of privacy (to some degree, because too much information causes its own annihilation), and there may be individual restrictions in force (if not ones imposed by appliance makers) which won’t allow the mesh network to function properly, or at least to its full possible extent. There are also serious security concerns, because a global mesh network may (or may not, depending on the architecture) weaken security and ease man-in-the-middle, identity theft and other attacks, allowing malicious hackers to gain access to mobile devices and sensitive private data.

Nevertheless, MANETs are the future. And when I finally receive my HTC HD2 (it should arrive within a few days, maybe tomorrow) I’m going to set up a small MANET for myself to test it and get hands-on knowledge of its routing systems and other such things. I’ll try to post something about this little research project once I’m done.

I was supposed to end this post many paragraphs ago, but… networks are so exciting. :)

Later all.


Back To The Future… ;)
