Archive for the ‘Computer Science’ Category

linux 2.6.21-bw

Saturday, April 28th, 2007

There are no official reiser4 patches for .20 or .21. Quite a few branches contain support for reiser4 (for instance the -mm tree), but these are highly unstable. Because of that, and because I wanted to give stacked git a shot, I started my own kernel tree:

One big bzip2-ed diff: bw-for-2.6.21.diff.bz2.
bzip2-ed tar with the separate patches: bw-for-2.6.21.tar.bz2.

This release contains reiser4, suspend2, the gentoo patches and a few patches to get everything working together nicely.

Virtual package for your python application

Saturday, April 21st, 2007

When you’ve got a big python application, you’ll usually split it up into modules. One big annoyance I’ve had is that a module inside a directory cannot (easily) import a module higher up in the tree. E.g.: drawers/gtk.py cannot import state/bla.py.

This is usually solved by making the application a package. This allows for import myapp.drawers.gtk from everywhere inside your application. To make it a package though, you need to add the parent directory to the sys.path list. Unfortunately, this also exposes all other subdirectories of the parent directory as packages.

However, when the package module (e.g. myapp) has already been loaded, the path from which myapp was loaded is used to find the submodules (e.g. myapp.drawers.gtk) and sys.path isn’t consulted at all. So, here is the trick:

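import sys
import os.path

# Absolute path of the application root (the directory holding this file);
# abspath comes first, so a relative or empty dirname cannot break the lookup.
p = os.path.dirname(os.path.abspath(__file__))
# Temporarily expose the parent directory so the root can be imported
sys.path.append(os.path.dirname(p))
# Import the root directory as a package (e.g. myapp)
__import__(os.path.basename(p))
# Drop the parent directory again so its other subdirectories stay hidden
sys.path.pop()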

Note that this script doesn’t work when directly executed, because the __file__ attribute is only available when loaded as a module.

Save this script as loader.py in the root of your application. import loader from the main script in your app, and you’ll be able to import modules by myapp.a.module, where myapp is the root directory name of your application.
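The effect can be checked with a minimal sketch, assuming a throwaway application tree built in a temporary directory. The myapp layout, the VALUE constant and the helper names are made up for illustration, and the sketch uses importlib with try/finally instead of the exact loader.py shown above:

```python
import importlib
import os
import sys
import tempfile


def build_demo_app(root):
    """Create a hypothetical tree: myapp/{state/bla.py, drawers/gtk.py}."""
    app = os.path.join(root, "myapp")
    for sub in ("state", "drawers"):
        os.makedirs(os.path.join(app, sub))
        open(os.path.join(app, sub, "__init__.py"), "w").close()
    open(os.path.join(app, "__init__.py"), "w").close()
    with open(os.path.join(app, "state", "bla.py"), "w") as f:
        f.write("VALUE = 42\n")
    with open(os.path.join(app, "drawers", "gtk.py"), "w") as f:
        # A module deep in the tree importing a sibling branch by package name
        f.write("from myapp.state import bla\nRESULT = bla.VALUE\n")
    return app


def load_root_package(app_dir):
    """Import app_dir as a package, exposing its parent only for that moment."""
    parent = os.path.dirname(os.path.abspath(app_dir))
    sys.path.append(parent)
    try:
        return importlib.import_module(os.path.basename(app_dir))
    finally:
        sys.path.remove(parent)


demo_root = tempfile.mkdtemp()
demo_app = build_demo_app(demo_root)
load_root_package(demo_app)

# The parent directory is gone from sys.path, yet submodules still resolve,
# because they are looked up through the already loaded myapp package.
gtk_module = importlib.import_module("myapp.drawers.gtk")
print(gtk_module.RESULT)
```

The second import succeeds only because myapp is already loaded; the temporary directory itself is no longer on sys.path at that point.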

The Filesystem Failed. Part I: introduction

Saturday, March 3rd, 2007

The Filesystem (I’ll consider the Linux VFS as an example) has failed:

  • Database storage is implemented on top of the Filesystem, because the Filesystem is incapable of serving the needs of relational storage.
  • Metadata is stored inside files in many different formats which can only be guessed by clumsy ‘magic’ in the headers. This forces many media players and desktop search applications to duplicate tag information in their own databases. Each of them has only limited support for each of the many different formats.
  • More and more device and service abstractions are moving from the Filesystem to separate namespaces, because the Filesystem’s API is inadequate. Take for instance oss, which used /dev/dsp, whereas alsa uses its own. Many new abstractions don’t even go near the filesystem anymore, for instance kevents, futexes, networking, dbus and hal.
  • Small files are stored in (compressed) packs and archives because the Filesystem can’t handle them. This happens, for instance, with your mailbox.

The problem comes down to fragmentation of data and metadata in too many namespaces because the Filesystem doesn’t seem to be an adequate one.

In a series of posts I’ll look at the possibilities to create one unified filesystem.

ati-drivers-8.33.6 for Gentoo

Friday, February 2nd, 2007

This is a slightly adjusted 25.3 ebuild that will give you the 8.33.6 ati-drivers for Gentoo. Yes, it’s dirty. They aren’t in the main tree yet because they are considered broken, although it works just fine for me.

Download: ati-drivers-overlay-8.33.6.tar.bz2

Extract them to an overlay.

Update: the 8.33.6 drivers are in the mainline tree now, so you should use those instead of mine.

(auto)mounting removable media as user

Wednesday, December 13th, 2006

I’ve always been bothered by the fact that you need to be root to mount anything (like a usb stick). It can be solved a bit by setting up udev rules and putting a specific device in /etc/fstab, but that only works for that single usb stick. Pretty annoying.

Googling only gives you stupid and silly solutions (like allowing users to mount /dev/sd[a-z], which is a security risk).

Luckily I’ve recently been pointed to ivman, which is an automounter. It automatically mounts removable media for you in /media.

I looked at the internals of ivman, and noticed that it uses pmount, a wrapper around mount that allows users to mount removable media under /media. Great!

Btw, you need to be in the plugdev group to use pmount.

Update: It seems that gnome-mount also works fine when you’re in the plugdev group. Gnome-mount does about the same as pmount, with the advantage that gnome-mount has nice gui integration everywhere in gnome.

Disappearing backgrounds in IE.

Friday, December 8th, 2006

In Internet Explorer, backgrounds sometimes disappear, seemingly at random. The fix? Add the position: relative; style entry to the div tags concerned.

Long mySQL keys

Sunday, December 3rd, 2006

Instead of limiting your long key (for instance a path) to approximately 800 characters (on mySQL with generic UTF8), you can hash your key and store the hash as index.

The drawback is that you need a good hash function to avoid collisions if you want to use the hash as a unique or primary key. Such hash functions tend to require some computing time.
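As a rough sketch of the idea, the snippet below stores a SHA-1 hash of a long path as the primary key and keeps the full path in an ordinary column. It uses Python’s built-in sqlite3 as a stand-in for mySQL, and the table, column and function names are made up for illustration:

```python
import hashlib
import sqlite3


def key_hash(path):
    """Fixed-length (40 hex characters) stand-in for an arbitrarily long key."""
    return hashlib.sha1(path.encode("utf-8")).hexdigest()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE files (path_hash CHAR(40) PRIMARY KEY, path TEXT NOT NULL)"
)


def insert_path(conn, path):
    # The short hash is what gets indexed; the long path is just payload
    conn.execute(
        "INSERT INTO files (path_hash, path) VALUES (?, ?)",
        (key_hash(path), path),
    )


def find_path(conn, path):
    # Look up by hash, then compare the full path to guard against collisions
    row = conn.execute(
        "SELECT path FROM files WHERE path_hash = ? AND path = ?",
        (key_hash(path), path),
    ).fetchone()
    return row is not None


long_path = "/" + "/".join(["directory"] * 200)  # about 2000 characters
insert_path(conn, long_path)
print(find_path(conn, long_path))
```

Comparing the full path in the lookup keeps the query correct even in the unlikely case of two keys sharing a hash; the uniqueness constraint itself still rests on the hash alone.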

md5(microtime())

Sunday, December 3rd, 2006

Don’t use md5(microtime()). You might think it’s more secure than md5(rand()), but it isn’t.

With a decent number of tries and a method of syncing (like a clock on your website), one can predict the result of microtime() to the millisecond. That leaves only about 1000 possible return values of microtime() to guess. That isn’t safe.
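To make the numbers concrete, here is a small sketch of the guessing attack in Python. It assumes the token is the md5 of a string in the "0.UUUUUU00 SSSSSSSSSS" form that PHP’s microtime() returns by default, and that the attacker already knows the second and the millisecond; the function names and the example timestamp are made up:

```python
import hashlib


def php_style_microtime(sec, usec):
    """Mimic the assumed default microtime() string: fraction first, then seconds."""
    return "0.%06d00 %d" % (usec, sec)


def make_token(sec, usec):
    text = php_style_microtime(sec, usec)
    return hashlib.md5(text.encode("ascii")).hexdigest()


def crack_token(token, sec, msec):
    """Try every microsecond inside the known millisecond: at most 1000 guesses."""
    for guess in range(msec * 1000, msec * 1000 + 1000):
        if make_token(sec, guess) == token:
            return guess
    return None


secret_usec = 123456
token = make_token(1165104000, secret_usec)
recovered = crack_token(token, 1165104000, secret_usec // 1000)
print(recovered == secret_usec)
```

The md5() adds nothing here: the attacker simply hashes each of the 1000 candidates and compares.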

Just stick with md5(rand()), and if you’re lucky and rand() is backed by /dev/random you won’t even need the md5(). In both cases it will be quite a lot more secure than using microtime().

Simple Branch Prediction Analysis

Sunday, November 19th, 2006

This paper outlines a simple branch prediction analysis attack against the RSA decryption algorithm.

At the core of RSA decryption is a loop over all bits of the secret key number d. When a bit is 1, different code is executed than when the bit is 0, so the CPU takes a different branch depending on the bit.

A spy process can be run on the CPU which measures the branch cache of the CPU by flooding the cache with branches and measuring the time it takes. When the sequentially running secret process doing RSA decryption makes a different branch (1 instead of 0) it can be noticed in a change of execution time on the spy process’s branches.

In this way quite a lot of secret bits can be derived.
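The key-dependent branch can be sketched with textbook square-and-multiply exponentiation. This is only an illustration of the principle; the implementation attacked in the paper may be organized differently, and the trace list is added here purely to show what a spy process would try to reconstruct:

```python
def square_and_multiply(base, exponent, modulus):
    """Left-to-right modular exponentiation with one branch per exponent bit."""
    result = 1
    trace = []  # which way the branch went for each bit of the exponent
    for bit in bin(exponent)[2:]:
        result = (result * result) % modulus
        if bit == "1":
            # Extra multiplication: only executed for 1-bits of the secret
            result = (result * base) % modulus
            trace.append(1)
        else:
            trace.append(0)
    return result, trace


value, branches = square_and_multiply(4, 13, 497)
print(value, branches)
```

The sequence of taken and not-taken branches is exactly the binary expansion of the exponent, which is why observing the branch predictor leaks bits of d.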

There are some clear buts:

  • You must be able to insert a spy process on the computer itself and it should know exactly when the RSA process runs.
  • To attain clear readings, there shouldn’t be other processes claiming too much CPU time.
  • The spy and RSA processes should run on the same physical processor and preferably at the same time (dual core).

An easy fix would be to allocate a whole processor to the RSA process for the duration of the decryption, so no other process can spy. Another option would be to add noise in the Branch Prediction Buffer, but that would result in a performance loss.

RTFM, where?

Friday, October 27th, 2006

Recently a buddy on msn asked me a linux question. He had just started using linux, so he had some problems getting stuff done.

He downloaded an installer, he said, a .run, but he doesn’t know how to execute it. He tried googling for it and asking on forums, but didn’t get an answer, so he asked me.

I solved his problem, but I still wondered: if you are a total newcomer to linux, where can you find out that you need to put ‘./’ in front of a file in bash to execute it, and that you probably need to chmod +x the file too if you downloaded it from somewhere?

The bash tutorial would’ve probably solved it, but do you know that that thing in which you are typing actually is a separate program? Probably not.

I basically learned all this trivial stuff while following the gentoo installation manual, but I guess that’s a bit too much to ask from each new linux user. There should be a good linux introduction somewhere that explains this trivial stuff, to which I can redirect new users. Does anyone know one?

Reiser4 on 2.6.18

Friday, October 27th, 2006

Try this patch:
http://vipernicus.evolution-mission.org/…

Good and bad CAPTCHAs

Sunday, October 15th, 2006

CAPTCHAs are images whose content a user has to type into a textbox to prove that they are a human instead of some computer script. This is an example of a good CAPTCHA from yahoo:

yahoo53.jpeg

This is an example of a really bad CAPTCHA:
dotmac18.jpeg

What makes a CAPTCHA good, as in hard to solve by a computer? Let’s look at how a computer would solve a CAPTCHA. There are basically 3 parts:

  1. Remove rubbish background.
  2. Remove rubbish lines and partition the image into sections, each containing one letter.
  3. Recognize the letter with a neural network.

Part 1 is very easy in most cases: just filter out everything that isn’t black and isn’t a glyph-like curve. It gets a bit more difficult if the font and background colors are random, but usually it’s simple to distinguish between a glyph (small, curve-ish, solid color) and a background (solid, usually gradients). Software is way better at this step than humans.

Part 2 is the most difficult part for software. Distorting fonts isn’t that much of a problem, as long as the software can recognize separate curve-blobs. The real problem comes in when there are red-herring curves or when several glyphs are connected with curves, like in the yahoo CAPTCHA. When the CAPTCHA uses undistorted, fixed-aligned fonts, glyph-connecting curves like those in the dotmac CAPTCHA aren’t a problem, because you only need a little bit of code to recognize an authentic glyph curve (small, thin), and then you can predict the position of the other curves. Humans are better at this step than computers.

Part 3 is a bit tedious for software, but usually easier for specifically trained neural networks than for humans.
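The partitioning problem from part 2 can be illustrated with a toy sketch. It assumes the background has already been removed and the image is a tiny grid of ink (#) and empty (.) cells, and it splits glyphs at empty columns; real solvers are more sophisticated, and the function name and sample grids are made up:

```python
def split_glyphs(rows):
    """Partition a binary image into column ranges separated by empty columns."""
    width = len(rows[0])
    has_ink = [any(row[x] == "#" for row in rows) for x in range(width)]
    segments = []
    start = None
    for x, filled in enumerate(has_ink):
        if filled and start is None:
            start = x  # a new glyph begins
        elif not filled and start is not None:
            segments.append((start, x))  # glyph ended at an empty column
            start = None
    if start is not None:
        segments.append((start, width))
    return segments


separate = [
    "##..##",
    "#...#.",
    "##..##",
]
connected = [
    "##..##",
    "######",  # one extra curve running through both glyphs
    "##..##",
]

print(split_glyphs(separate))
print(split_glyphs(connected))
```

A single connecting curve is enough to merge two glyphs into one segment for this naive approach, while a human still reads two letters.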

How to make a good CAPTCHA:

  • Do not add stupid backgrounds or differently coloured polygons. They won’t work at all and will only confuse the human.
  • Do not use a fixed font, size or alignment. Rotate the glyphs a bit, transform them a bit and, most importantly, place them unpredictably.
  • Add glyph-like curves that intersect preferably only two glyphs to make them less recognizable. Take care though that you don’t make them too font-like, because that’ll prevent the human from recognizing the glyphs. These extra intersecting curves make CAPTCHAs strong, because they prevent proper partitioning.
  • Don’t use strange fonts that might seem hard to see, but are easily recognizable. For instance, dotted fonts are very easy to locate when everything else consists of solid curves.

Update: nice blogpost on breaking captcha’s: http://www.brains-n-brawn.com/default.aspx?vDir=aicaptcha

The rm -r / typo

Thursday, July 27th, 2006

Today I accidentally made a (yes, very stupid) typo in a root console:

rm -r /

I noticed the typo almost immediately, but rm had already managed to wipe out my /bin and had started removing parts of /boot. This situation wasn’t very helpful for the stability of my system, as you might understand.

For the windows user: it’s a bit like deleting half of all executables in the windows folder.

One key difference: when running linux, you can fix it easily. I booted a livecd, mounted my system, copied the /bin from a stage3 tarball to my root partition and rebooted.

And it’s working again! There were some complaints about a libproc version mismatch with the binaries, but that’ll be easily solved by an emerge -e system.

You’ve just got to love linux (and other nixes, for that matter).

SINP: Push versus Pull

Thursday, July 27th, 2006

SINP is pull based — I give my SINP address to someone, and he will pull the information he wants from my SINP server.

Our competitor SXIP is push based. When I use my SXIP identity, I push all the information I want to provide to the service. There doesn’t even have to be a SXIP server (‘homesite’).

Push has got certain advantages over pull:

  • Pull is more complex: you need more traffic and more complicated traffic. Push is simpler.
  • You most likely need a separate server for pull (you need one with SINP at least), which makes you rely on your SINP server. You don’t need a real one for push.

But pull has advantages too:

  • You don’t need to actively give your information. When I’m offline someone can still pull information from my SINP identity.
  • Pull doesn’t require the actual information to go via your computer. If someone requests my creditcard number and I allow it, it won’t be redirected through the computer I’m using, which is safer.

Tilda

Saturday, July 22nd, 2006

Tilda is a drop-down terminal for linux. Press the assigned hotkey and the terminal drops down and gains focus; press it once again and it slides back out. Even better, the terminal isn’t closed, just hidden.

This is great for development. I write some code, press Ctrl+S to save, Alt+Q for the terminal and make. If there is a bug I can Alt+Q and return to my code, and if I didn’t look closely enough I can press Alt+Q again to see the output again in the terminal.

SINP Certificates and redirects

Wednesday, July 19th, 2006

Tuesday the 11th, we (Noud, Bram and I) had a meeting with some guys of the Security of Systems Group at the Radboud University. We discussed the security of the current SINP protocol. There hasn’t been a hard verdict on whether SINP is secure, because the SINP specification leaves a lot of details to implementations and SINP doesn’t make hard claims on its security yet (which can be either proved or disproved).

The meeting has yielded two new additions to SINP: document certificates [1] and redirects.

First off, SINP document certificates. At the moment you can’t trust the information in a SINP document. I can forge my SINP document and claim that I’m Belgian, which I am not. To allow the kind of trust that some people and services care about, we’ll allow certificates in your identity document. Basically you let someone sign a part of your SINP document and include that certificate.

Your government could sign your name in your SINP document, for instance, and you’d add that certificate to your document, which could be required by some services. These certificates are a bit tricky to design though, because they need to be secure and they need to be a bit more dynamic than your usual certificate because of the way SINP documents are queried.

A second problem we encountered during the meeting was how to be able to trust your SINP server. I (and other tech savvy people) can set up their own SINP server, which we can fully trust because we set it up ourselves. Not so tech savvy people can’t — they need to rely on existing SINP servers. The problem is whether we can trust those servers with our secrets.

Cees (if I recall correctly) coined the idea that some of your secrets are already on the internet. If you’ve got a VISA creditcard number, then VISA obviously has that creditcard number, and you trust them with it. What if VISA would store the part of your SINP identity document with your creditcardnumber on its own SINP Server?

Basically I go to a big SINP provider (which I don’t trust), I create a SINP identity and put in my SINP document that you can find my creditcard number under the SINP identity bas@visa.com. This act of redirecting clients to other SINP identities is called a SINP Redirect. SINP Redirects could also prove very useful when you change your SINP server. The only thing you’d have to do is to set up a SINP redirect in your old identity document to your new identity document.
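How a client might follow such redirects can be sketched as follows. Everything in it is hypothetical: the post does not define how redirects are represented, so the documents are plain dictionaries, a redirect is a {"redirect": identity} entry, and bas@bigprovider.example, the field name and the placeholder number are invented:

```python
def resolve(documents, identity, field, max_hops=5):
    """Follow redirect entries until a real value is found or we give up."""
    seen = set()
    while identity not in seen and len(seen) < max_hops:
        seen.add(identity)
        entry = documents.get(identity, {}).get(field)
        if entry is None:
            return None  # unknown identity or field
        if isinstance(entry, dict) and "redirect" in entry:
            identity = entry["redirect"]  # ask the other identity instead
            continue
        return entry
    return None  # redirect loop or too many hops


documents = {
    "bas@bigprovider.example": {"creditcard": {"redirect": "bas@visa.com"}},
    "bas@visa.com": {"creditcard": "0000 0000 0000 0000"},
    "loop-a@example.org": {"creditcard": {"redirect": "loop-b@example.org"}},
    "loop-b@example.org": {"creditcard": {"redirect": "loop-a@example.org"}},
}

print(resolve(documents, "bas@bigprovider.example", "creditcard"))
```

The loop check matters because anyone can write a redirect into their own document, including one that points back at itself.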

Both SINP Certificates and SINP Redirects will require a lot of thought to implement cleanly and simply, which is tricky.

Any thoughts would be welcome.

1: Actually, these certificates aren’t new; Bram came up with the idea quite a while ago.

Typed Yields: Non fatal exceptions

Sunday, July 16th, 2006

Wouldn’t it be nice to have:

begin {
    LoadTheDatabase("foo.bar");
} rescue (Exception e) {
    print "Fatal exception happened: ", e
} on (Warning w) {
    print "Database Warning: ", w
} on (Message m) {
    print "Database Message: ", m
}

The rescue (Exception e) should be familiar to everyone: something failed, maybe the database file was badly corrupted, an exception was raised and the rescue block is executed.

But what if the database has a small error, or something is only a little bit out of place? You wouldn’t want to just ignore it, but warn about it. Usually one would implement a ‘Logger’ class to which a function can log certain events, but that is ugly and inconvenient.

Enter non fatal exceptions. Basically there would be two ways to raise an exception, fatal like we all know it, and non fatal. When the on block for a non fatal exception has been executed, control will be returned to the function in which the raise was called.

This is done in about the same way as a lot of languages implement yield. But this time the handling code depends on the type of the yielded object.

As far as Kaja and I are concerned, this will be a feature of Paradox.

Thanks go to Bram for the idea.

sinp.rb

Monday, July 3rd, 2006

irb> require 'sinp'
irb> c = SINP::Client.new nil, nil, [:http]
irb> c.getPublicDocument('Kristy@w-nz.com').write
<requested version='2'>
<sinp-id>
<name><nick>kristy</nick></name>
<address type='email'>kbuiter@hotmail.com</address>
<uri>hotmail.com</uri>
</sinp-id>
</requested>

As you can see, I’ve almost finished the implementation of a Ruby SINP client. I only have to finish SINP Negotiation.

SINP

Saturday, July 1st, 2006

SINP is a protocol based on HTTP(S) and XML that provides you with an identity on the web. You register a so-called SINP Identity on a SINP Server of your choice. To address a certain identity, we use an email-like notation: bas@w-nz.com is the SINP Identity of the user bas on the SINP Server w-nz.com.

The first big feature of SINP is authentication. If someone claims to be bram@w-nz.com, I can check that by asking w-nz.com to check it. I’ll redirect that guy to w-nz.com to let him be checked by his proclaimed SINP Server. If he really is bram@w-nz.com, he’ll have a nice session cookie for w-nz.com and w-nz.com will check that. After that w-nz.com will redirect him back and I’ll ask w-nz.com whether he succeeded.

One major application of this authentication is that someone who posts a blog comment as noud@w-nz.com really is the same guy (or guys) who posted before as noud@w-nz.com, because w-nz.com vouches for them.

The second big feature of SINP is that each identity comes bundled with an XML document, which can store information about the owner like his name, email address, date of birth, etc. The SINP Server stores this document. The identity owner can pick an access policy for each little bit of information in this document. You might want to share your real address only with those you’ve explicitly allowed. Everyone can see the parts of your document you’ve allowed everyone access to. This is the one for bas@w-nz.com.

To get specific parts instead of the whole thing, or to get to stuff you’ve limited access to, one needs to use SINP Negotiation. To get some specific information from noud@w-nz.com, I ask w-nz.com for this specific information, in the form of a few xpaths. Along with the xpaths I can send my own SINP address, bas@w-nz.com. The server will respond to each request according to the access policy stored on the SINP Server. There are several possibilities:

  • Ok, the requested information will be included in the response.
  • Nope, you’re denied access to that.
  • Not found, that stuff isn’t in this document.
  • You’ll have to ask Noud. Basically you’ll have to redirect Noud to the server, where he will be authenticated, and after that he can decide whether to allow you access to it, and you can try again later on.
  • If you’re bas@w-nz.com, you can see it. You’ll have to authenticate, though. This is done via sinp authentication as described before.

Another big feature of SINP is versioning, which allows caching. The version of a specific bit of information is sent back with each response in negotiation. In a negotiation request, I can specify the version I already have. In case that specific part of the document hasn’t been updated, the SINP server will let me know, instead of sending the whole thing.
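The possible responses and the version check can be summarized in a rough sketch of the server-side decision. It is not taken from the SINP specification: the status strings, the policy values, the sample xpaths and the sample data are all invented, and real negotiation happens over HTTP(S) with XML rather than Python dictionaries:

```python
def negotiate(document, xpath, requester=None, authenticated=False,
              known_version=None):
    """Return (status, value, version) for one requested xpath."""
    entry = document.get(xpath)
    if entry is None:
        return ("not-found", None, None)
    policy = entry["policy"]
    if policy == "nobody":
        return ("denied", None, None)
    if policy == "ask-owner":
        return ("ask-owner", None, None)  # the owner has to decide first
    if policy != "everyone":
        # Otherwise the policy is a set of identities that may see the value
        if requester not in policy:
            return ("denied", None, None)
        if not authenticated:
            return ("auth-required", None, None)
    if known_version == entry["version"]:
        return ("not-modified", None, entry["version"])  # cache is still valid
    return ("ok", entry["value"], entry["version"])


noud_document = {
    "/sinp-id/name/nick": {
        "value": "noud", "version": 3, "policy": "everyone"},
    "/sinp-id/address": {
        "value": "noud@example.org", "version": 1,
        "policy": frozenset(["bas@w-nz.com"])},
    "/sinp-id/birthdate": {
        "value": "unknown", "version": 1, "policy": "ask-owner"},
}

print(negotiate(noud_document, "/sinp-id/name/nick"))
print(negotiate(noud_document, "/sinp-id/name/nick", known_version=3))
```

A client that remembers the version from an earlier response only receives the value again when it has actually changed.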

One advantage of caching and negotiation is that information can be kept synchronized with your document when it updates. A blog on which you’ve posted a comment might periodically check whether the information it retrieved from your SINP document has changed. This can be done cheaply with negotiation and versioning.

SINP is quite simple and therefore easy to implement. It is also portable, because it builds on widely supported technologies like XML and HTTP(S).

SINP is under development, but you can already (and really should) take a look at:

SINP is based on things I’ve seen floating around on the web, for instance Zef’s SPTP.

At the moment of writing we’re developing a PHP client and a Python client, and continuing development of the PHP server. You’re welcome to participate!

We hope you like it, comments or any other forms of participation would be very welcome.

Bas, Bram and Noud.

SINP & Codeyard

Saturday, July 1st, 2006

SINP, which was devised by Noud, Bram and me, has won the Capgemini Opensource Award! It is a protocol that provides you with an identity that allows authentication and negotiation of information linked with the identity as an identity document (basically the whole functionality put into a way too short sentence). Lots of rejoicing! You will hear more about this :-).