Archive for category cool project ideas

Task-focused desktop environment

The idea of task-focused environments is that your attention stays on the task you are currently working on: the computer presents you with the tools you need, in the way you need them, so you can work efficiently, while hiding or graying out unrelated things that might distract you.

Mylyn is a very useful implementation of this for software development IDEs. Now, can we have the same for the whole desktop? Tasktop implements it, but in my opinion they make the mistake of trying to replace, reimplement and reinvent the existing desktop. The desktop should be preserved: it is what the user is used to, a lot of development has gone into it, and its features are not replaceable. Also, the user will want to be able to turn the thing off.

I can see a desktop app that allows you to associate applications and folders with certain tasks, e.g. “email&im”, “draw”, “write”. While a task is active, it hides the unrelated windows and sets the bookmarks / favorite folders.

This can be done for existing desktops, and perhaps even cross-platform (Windows, KDE, GNOME): A Java application with different backends that handle window placement, favorite places, etc.
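The split between task logic and desktop backends could look roughly like the sketch below. It is written in Python rather than Java to keep it short, and every name in it (`Task`, `InMemoryBackend`, `switch_task`) is hypothetical; a real backend would have to talk to the window manager of Windows, KDE or GNOME instead of keeping state in memory.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """A named task with the applications and folders associated with it."""
    name: str
    applications: set = field(default_factory=set)
    folders: list = field(default_factory=list)


class InMemoryBackend:
    """Stand-in backend that only records what it was asked to do."""

    def __init__(self, windows):
        self.windows = dict(windows)  # window id -> application name
        self.hidden = set()
        self.bookmarks = []

    def list_windows(self):
        return dict(self.windows)

    def set_hidden(self, window_id, hidden):
        if hidden:
            self.hidden.add(window_id)
        else:
            self.hidden.discard(window_id)

    def set_bookmarks(self, folders):
        self.bookmarks = list(folders)


def switch_task(task, backend):
    """Hide windows unrelated to `task`; passing None turns the feature off."""
    for window_id, application in backend.list_windows().items():
        unrelated = task is not None and application not in task.applications
        backend.set_hidden(window_id, unrelated)
    backend.set_bookmarks(task.folders if task is not None else [])
```

Calling `switch_task(None, backend)` shows every window again, which covers the requirement that the user can switch the whole thing off.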

Command recognition

Have you noticed something about speech recognition? It is not here. It is not on my laptop and (probably) not on yours. I hear it works somewhat well with Dragon NaturallySpeaking and in some games, but it requires training.

Worse, commanding your computer to do something is not here. Vista has something, MacOS had something for a while. I doubt they work.

I would like free software that works, does not require speaker databases and can command your computer. That is, you train words for command classes, and it executes them. Shouldn’t be too complicated, should it? Saying “next” or “weiter” to get the next slide in a presentation would be useful. Or whistling. I believe it would also be helpful for impaired computer users.

So here is how I thought it could be done (and it looks good so far):

  1. Cut the audio into time slices (around 0.01 s long).
  2. An FFT of each slice gives a characteristic power curve P(f) at that time t.

So for one word/command/sequence you have a P(t, f).
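The two steps above can be sketched with the standard library alone. This is only an illustration: it uses a naive O(n²) DFT instead of a real FFT, applies no window function, and the names `power_spectrum` and `spectrogram` are made up for this example.

```python
import cmath
import math


def power_spectrum(frame):
    """Power per frequency bin of one time slice (naive DFT, not a fast FFT)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(frame[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def spectrogram(samples, sample_rate, slice_seconds=0.01):
    """Cut the signal into consecutive slices and return P[t][f]."""
    size = round(sample_rate * slice_seconds)
    starts = range(0, len(samples) - size + 1, size)
    return [power_spectrum(samples[s:s + size]) for s in starts]


# Example: a 1 kHz tone sampled at 8 kHz. Each 0.01 s slice holds 80 samples,
# so bin k corresponds to k * 100 Hz and the strongest bin is bin 10.
rate = 8000
tone = [math.sin(2 * math.pi * 1000 * i / rate) for i in range(800)]
P = spectrogram(tone, rate)
peak_bin = max(range(len(P[0])), key=lambda k: P[0][k])
```

For real audio one would use a proper FFT library and overlapping, windowed slices; the data layout P[t][f] stays the same.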

Next, you treat P(t, f) as an instance of a Gaussian distribution N(mean_freq, sigma_freq): from several recordings of the same command you can determine an empirical mean and sigma of the power at each (t, f). This is the command template N(t, f, mean_freq, sigma_freq).

Checking whether a curve fits can be done by multiplying the probabilities of P(t, f) occurring under N(t, f, mean_freq, sigma_freq) and requiring the product to stay above a lower limit.

This method filters out noise because sigma will be very high at noisy frequencies, so deviations there barely change the result.
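A minimal sketch of such a template, assuming all training recordings have already been cut to the same P[t][f] shape. One deliberate swap: instead of multiplying probabilities it sums log-densities, which is the same comparison but avoids numeric underflow. The function names and the `min_sigma` floor are assumptions for this example.

```python
import math
import statistics


def build_template(recordings, min_sigma=1e-3):
    """Return a grid of (mean, sigma) per (t, f) from several P[t][f] grids."""
    template = []
    for t in range(len(recordings[0])):
        row = []
        for f in range(len(recordings[0][0])):
            values = [rec[t][f] for rec in recordings]
            mean = statistics.fmean(values)
            # Floor sigma so a cell that never varied cannot divide by zero.
            sigma = max(statistics.pstdev(values), min_sigma)
            row.append((mean, sigma))
        template.append(row)
    return template


def log_density(x, mean, sigma):
    """Log of the Gaussian density N(mean, sigma) at x."""
    return -0.5 * ((x - mean) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))


def total_log_score(candidate, template):
    """Sum of log-densities over all (t, f) cells."""
    return sum(
        log_density(x, mean, sigma)
        for row, template_row in zip(candidate, template)
        for x, (mean, sigma) in zip(row, template_row)
    )


def matches(candidate, template, min_log_score):
    """True if the candidate stays above the lower limit."""
    return total_log_score(candidate, template) >= min_log_score
```

A cell with a large sigma penalizes a given deviation far less than a cell with a small sigma, which is the noise-tolerance effect described above.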

Tackling offsets/shifts

The starting point of a command is obviously unknown. So far, we can detect the probability of the current curve fitting a slice of a command template.

So how can a command be detected? By continuously following the probabilities (i.e. following the potential command in time, comparing with each command template), multiplying probabilities, and dropping out those that fall below the probability limit.
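One way to read this is as a set of running hypotheses: every incoming slice may be the start of any command, each hypothesis accumulates a score as slices arrive, and it is dropped once it falls below the limit. The sketch below assumes templates stored as rows of (mean, sigma) pairs and, as an assumption of this example, uses summed log-densities with a per-slice average limit instead of a raw product of probabilities.

```python
import math


def log_density(x, mean, sigma):
    """Log of the Gaussian density N(mean, sigma) at x."""
    return -0.5 * ((x - mean) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))


def detect(stream, templates, min_avg_log_score):
    """Return (command, end_index) for every hypothesis that survives a whole template.

    stream:    iterable of slices, each a list of powers per frequency bin
    templates: {command: [[(mean, sigma), ...] for each slice of the command]}
    """
    active = []  # each entry: [command, next template row, accumulated score]
    detections = []
    for index, powers in enumerate(stream):
        # Any slice may be the first slice of any command.
        active.extend([name, 0, 0.0] for name in templates)
        survivors = []
        for name, pos, score in active:
            row = templates[name][pos]
            score += sum(log_density(x, m, s) for x, (m, s) in zip(powers, row))
            pos += 1
            if score < min_avg_log_score * pos:
                continue  # fell below the limit: drop this hypothesis
            if pos == len(templates[name]):
                detections.append((name, index))
            else:
                survivors.append([name, pos, score])
        active = survivors
    return detections
```

The number of live hypotheses is bounded by the number of templates times the longest template length, so the work per slice stays small.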

Tackling frequency shifts and frequency and time dilatation

Obviously, frequency shifts and variations in frequency and time can occur. This can be assumed to be of first or second order:

f_real = f_shift + f_orig + f_increase * t

Starting with f_shift = 0 and f_increase = 0, we can change these parameters (while staying within a certain limit) while the command fitting probability improves.
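The parameter search could be a simple greedy hill climb. In the sketch below, `score` is a hypothetical callback that returns the fitting score of the current input against a template evaluated at f_real for the given f_shift and f_increase; the step sizes and limits are arbitrary assumptions.

```python
def fit_shift(score, step_shift=1.0, step_increase=0.1, max_shift=5.0, max_increase=0.5):
    """Start at (0, 0) and keep moving one step at a time while the score improves."""
    shift, increase = 0.0, 0.0
    best = score(shift, increase)
    moves = ((step_shift, 0.0), (-step_shift, 0.0), (0.0, step_increase), (0.0, -step_increase))
    improved = True
    while improved:
        improved = False
        for d_shift, d_increase in moves:
            cand_shift = shift + d_shift
            cand_increase = increase + d_increase
            # Stay within the allowed limits.
            if abs(cand_shift) > max_shift or abs(cand_increase) > max_increase:
                continue
            value = score(cand_shift, cand_increase)
            if value > best:
                shift, increase, best = cand_shift, cand_increase, value
                improved = True
    return shift, increase, best
```

A greedy search like this can get stuck in a local optimum; it only illustrates the idea of changing the parameters while the fit improves.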

The same can be done for time dilatation.

This is very similar to squeezing and shifting a 3D-surface to compare it with another 3D-surface. The best fit still has some non-fitting areas. A weighted integral gives the difference and is compared to a maximal difference.

Of course everything has to be normalized, etc.

How does speech recognition work today?

Systems used to use time variation, but then moved on to hidden Markov models. Phonemes are modeled and compared against example speakers for each language. More in the wiki article.

This is obviously a technically superior approach to just comparing frequencies, but at the moment speech recognition is not here. Systems like Sphinx are not easily installed and trained.

I think people would prefer a

  • a system they have to train themselves
  • that always works
  • but only in the scenarios/environments they need it for (and trained it for)

to

  • a system they can only partially train themselves
  • that does not work reliably
  • but is meant for all scenarios (general purpose)

The former is what I’m playing with, the latter are available systems.

NZ – First day in University: Ubiquitous computing

The first day at university started just like at the university in Vienna: sleeping until 12, breakfast, then showing up at some point for the lecture (3pm). Well, they’ll move it to 9am, so that’s that.

Ubiquitous computing: small computers you use but aren’t necessarily aware of. They are just there, all the time, portable (or not), multipurpose. PDAs and mobile phones are the more obvious ones.

As a research project theme, I’m thinking about store-and-forward protocols over inhomogeneous, non-persistent networks.

Imagine walking through a busy street, people having mobile phones, PDAs, etc. Imagine you had a fast means of connecting these devices (e.g. Bluetooth, wireless) and could send short messages in the seconds while people pass you by. A message can hop from device to device to ultimately reach its destination (a phone, the internet, …).

Think of people making the net rather than the net coming from outside and people being attached to it.

For the technical side: Surely only short messages are of interest. But these can be any kind of data (e.g. TXT aka SMS, pictures). Every phone can yell “I am 123”, “I am looking for 234”. Assuming phones have some storage, the devices can build up a tree of phones they have seen (some very dynamic routing protocol). Messages can then be passed (or copied) in the hope they’ll reach the destination in this direction.
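To get a feeling for the "copied" variant, here is a toy simulation. It is a deliberately naive flooding scheme (every encounter with a carrier copies the message) and not the tree-based routing hinted at above; the function name and the encounter format are assumptions for this example.

```python
def simulate(encounters, source, destination):
    """Flood one message along a time-ordered list of (device_a, device_b) encounters.

    Returns (step, carriers): the index of the encounter at which the
    destination received the message (None if it never did) and the set of
    devices holding a copy at that point.
    """
    carriers = {source}
    for step, (a, b) in enumerate(encounters):
        # If either device already carries the message, both carry it afterwards.
        if a in carriers or b in carriers:
            carriers.update((a, b))
        if destination in carriers:
            return step, carriers
    return None, carriers
```

The order of encounters matters: if the relay meets the destination before it meets the sender, the message does not arrive.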

For the security side: The data may be encrypted. The phone numbers for routing can be hashed.
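Hashing the routing identifiers could be as small as the sketch below. The function name, the optional salt and the truncation to 16 hex characters are assumptions for this example. Note that phone numbers are short, so a plain hash can be reversed by trying all numbers; it hides the number from casual observers only.

```python
import hashlib


def routing_id(phone_number, salt=b""):
    """Derive a short identifier to announce instead of the phone number itself."""
    digest = hashlib.sha256(salt + phone_number.encode("utf-8")).hexdigest()
    return digest[:16]
```

Two devices that know the same number (and salt) compute the same identifier, so "I am looking for 234" still works without shouting the number itself.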

But think of a twitter network available even after an earthquake destroyed the mobile network transmitters. Some devices may support GPS, so positions can be approximately recorded. This might ease finding people (if they want to be found). If the net gets sufficiently dense (think Manhattan), a really efficient network might emerge where you can reach people in real-time.

Obviously these thoughts have issues, but I’m interested in what research has been done by people in this area. I won’t do research on this myself, and probably no field testing or programming either (well, maybe routing protocols on simulators).

GNOME Services (ssh tunnels, …)

A GTK panel widget for watching user services. For example, I use ssh tunnels for pretty much everything (mail most importantly).
It should provide functionality to

  • Test if service is still alive
    • Test program: expected return code
    • Interval
    • don’t wait forever for the test program
  • Restart/Start service
    • automatically on start?
    • automatically if test program fails
    • manually
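The core of these notes, without any GTK around it, might look like the following sketch. The function names and defaults are assumptions; the commands are passed as argument lists, as `subprocess` expects.

```python
import subprocess
import time


def is_alive(test_cmd, expected_returncode=0, timeout=5.0):
    """Run the test program, but don't wait forever for it."""
    try:
        result = subprocess.run(
            test_cmd,
            timeout=timeout,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == expected_returncode


def watch(test_cmd, start_cmd, interval=30.0, rounds=None, start_on_launch=False):
    """Check the service every `interval` seconds and restart it when the test fails.

    rounds=None loops forever; a number limits the loop (handy for trying it out).
    Returns how often the service had to be restarted.
    """
    restarts = 0
    if start_on_launch:
        subprocess.Popen(start_cmd)
    done = 0
    while rounds is None or done < rounds:
        if not is_alive(test_cmd):
            subprocess.Popen(start_cmd)
            restarts += 1
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval)
    return restarts
```

A panel widget would call `is_alive` from a timer and offer the manual start/restart as a menu entry.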

These are just the notes I have here. Python+PyGTK would be a good choice for implementing. Should be fairly simple.

Mail client securing tool

Most mail servers provide encryption today. However, because people prefer to have stuff working rather than secure, and secure unfortunately very often comes with not working out of the box, they end up with insecure configurations (for example in mail clients). And, more importantly: they will never change them, because stuff works. Also, the mail provider will never be able to shut down the insecure service, because it would break stuff, he’ll lose clients, and they will think the provider is unreliable.
Welcome to the world of the internet. This is the essential reason why we still don’t use encryption. HTTP is the exception: there we couldn’t use encryption everywhere for CPU speed and caching reasons.

Enough rambling, Johannes, what is your project idea?

A tool that analyzes the mail client configuration. It checks whether the mail server provides a more secure way of connecting (in a “port scanning”-like fashion). It tests the more secure configuration. If successful, the user is given the option to automatically change the configuration.
This should have read/write backends for Thunderbird, Outlook, etc.
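The check itself could be sketched as below. The account dictionary layout and the function names are assumptions for this example, and it only looks at the well-known implicit-TLS ports (993 for IMAP, 995 for POP3, 465 for SMTP); STARTTLS on the plain ports would need a separate probe. The probe is passed in as a parameter so the decision logic can be tried without a network.

```python
import socket
import ssl

# Well-known implicit-TLS ports per protocol.
SECURE_PORTS = {"imap": 993, "pop3": 995, "smtp": 465}


def tls_probe(host, port, timeout=5.0):
    """True if a TLS handshake with certificate verification succeeds."""
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host):
                return True
    except OSError:  # also covers ssl.SSLError and timeouts
        return False


def suggest_upgrade(account, probe=tls_probe):
    """Return a more secure copy of the account settings, or None.

    account: {"protocol": "imap" | "pop3" | "smtp", "host": ..., "port": ..., "tls": bool}
    """
    if account["tls"]:
        return None  # already secure, nothing to suggest
    port = SECURE_PORTS.get(account["protocol"])
    if port is not None and probe(account["host"], port):
        return {**account, "port": port, "tls": True}
    return None
```

The read/write backends would translate between the client's own configuration files and this dictionary; the user (or a sysadmin script) decides whether to apply the suggestion.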

Also, it could advise the user to change from Outlook Express to <Your favorite>.

Another feature would be to automate the process without a UI. Sysadmins could deploy and run it on a whole client network.
Deploy is a funny word by the way…

POP server editor (online message management)

POP servers hold a list of messages. Modern mail clients usually just download and delete everything from the server. Or you can set them to leave the stuff on the server.
It would be interesting to make a tool that lets you view the messages on the server (headers), optionally display a whole message, and selectively delete messages on the server, maybe with a search function. It should not replace a mail client or try to take over its functionality.
POP is a very simple protocol, so this should be an easy one.
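With Python's `poplib` the core fits in a few lines. A sketch, assuming a server that supports the optional TOP command; the helper names `connect`, `list_headers` and `delete_matching` are made up for this example.

```python
import poplib
from email.parser import BytesParser


def connect(host, user, password):
    """Open an encrypted POP3 session (POP3 over TLS, default port 995)."""
    conn = poplib.POP3_SSL(host)
    conn.user(user)
    conn.pass_(password)
    return conn


def list_headers(conn):
    """Return [(number, size, subject, sender)] without downloading bodies."""
    _, listing, _ = conn.list()
    result = []
    for entry in listing:
        number, size = (int(part) for part in entry.split())
        _, lines, _ = conn.top(number, 0)  # headers plus 0 body lines
        headers = BytesParser().parsebytes(b"\r\n".join(lines), headersonly=True)
        result.append((number, size, headers.get("Subject", ""), headers.get("From", "")))
    return result


def delete_matching(conn, needle):
    """Mark every message whose subject or sender contains `needle` for deletion."""
    deleted = []
    for number, _, subject, sender in list_headers(conn):
        if needle.lower() in (subject + " " + sender).lower():
            conn.dele(number)
            deleted.append(number)
    return deleted
```

In POP3, messages marked with DELE are only removed when the session ends with `conn.quit()`; calling `conn.rset()` before that undoes the marks, which makes a "preview, then confirm" workflow straightforward.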

Project Ideas

I’m going to start a new category: cool project ideas that occurred to me (sometimes even with more technical requirements and design guidelines). Maybe I’ll pick them up to implement, maybe you will if you don’t know what your next project will be…
