Saturday, June 30, 2012

Alpha update: file associations, plugin management and SoundCloud

Time for yet another update to the alpha channel. This one contains some really cool new stuff.

File associations

Stoffi will now show a task dialog the first time you run it. It lets you either set Stoffi as your default music player (associating Stoffi with all file extensions and URL protocols it can handle), choose for yourself which associations to create, or skip the step entirely.

Quickly make Stoffi your default music player.

In addition to the usual file extensions such as mp3 and ogg, Stoffi will now also register itself for some URL protocols. These include playlist://, soundcloud:// and youtube://. This will allow us to post links and share music online, and have it play in Stoffi with just a click.
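
For those curious how this works under the hood: on Windows, a URL protocol handler boils down to a few registry keys following a standard convention. Here's a rough sketch of that technique in C# (the helper and the player path are illustrative, not Stoffi's actual installer code):

    using Microsoft.Win32;

    class ProtocolInstaller
    {
        // Sketch: register a URL scheme (e.g. "soundcloud") so that clicked
        // links launch the given player executable. Writing to
        // HKEY_CLASSES_ROOT requires administrator rights.
        static void RegisterProtocol(string scheme, string playerPath)
        {
            using (RegistryKey key = Registry.ClassesRoot.CreateSubKey(scheme))
            {
                key.SetValue("", "URL:" + scheme + " Protocol");
                key.SetValue("URL Protocol", ""); // marks the key as a protocol handler

                using (RegistryKey cmd = key.CreateSubKey(@"shell\open\command"))
                {
                    // Windows replaces %1 with the clicked URL.
                    cmd.SetValue("", "\"" + playerPath + "\" \"%1\"");
                }
            }
        }
    }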

Plugin management

Manage and configure plugins.

Plugin management has been improved with a dedicated page in the control panel, where you can see all installed plugins, install new ones or uninstall them. Plugins can now also define settings, which are displayed when the plugin is selected in the list.

SoundCloud

Stream music from SoundCloud.

You can now stream SoundCloud music directly in Stoffi! This one is really awesome and made me discover SoundCloud for the first time. There's some really cool indie music there that I think a lot of you will enjoy. It works pretty much the same as our existing YouTube streaming, except that it won't show a video.

Radio stations

Listen to Internet radio.

I also added Internet radio streaming. You can add radio stations by pasting their URLs. I plan on shipping some stations with Stoffi, so you get access to some fine music streams right out of the box.

Other noteworthy stuff

What's the average size of my music files?

A pretty nice feature is that you can now check the average length of the tracks in your Party playlist, or the total size of your music files. The details pane has been improved to show details for all items in the navigation pane.

Wednesday, June 27, 2012

Project Dance Controller: Report 5

This post is part of Project Dance Controller.

It's a bachelor thesis project with the aim of letting the quantity of dance movements in the room control the volume level using the Kinect for Windows hardware.

You can read all reports here.

My Internet connection was down, so I couldn't post on Monday as usual; that's why this post is a little late.

Last week

Last week I finished up the quantification of the movements in the depth frame. I've had the opportunity to test it out with three people and calibrate it to give some nice values.

The number is first derived from the depth frame by looking at each pixel and counting those where the depth differs between two consecutive frames by more than a certain amount, ignoring pixels with a value of -1, which indicates an unknown depth (caused by problems with the surface and the IR laser). That gave me the percentage of pixels in the frame where movement had occurred.
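
In code, the counting looks roughly like this (a simplified sketch; the threshold is one of the calibration values, so the exact number isn't shown here):

    using System;

    static class MovementAnalyzer
    {
        // Fraction of pixels whose depth changed noticeably between two frames.
        // Depth values are in millimeters; -1 marks an unknown depth.
        public static double MovementFraction(short[] previous, short[] current, int threshold)
        {
            int moved = 0, valid = 0;
            for (int i = 0; i < current.Length; i++)
            {
                if (previous[i] == -1 || current[i] == -1)
                    continue; // ignore pixels where the depth is unknown

                valid++;
                if (Math.Abs(current[i] - previous[i]) > threshold)
                    moved++;
            }
            return valid == 0 ? 0.0 : (double)moved / valid;
        }
    }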

Challenges

I first tried to achieve some sort of S-shaped graph using the cube root of the above value. This would mean that small movements wouldn't have too much effect, and after a certain threshold the graph would slowly approach 10. But I never got it quite right.

I also tried using a fraction, with the independent variable in the denominator, but that gave smaller movements too much effect, while using an exponential graph made it hard to have any effect while dancing alone.

So in the end I settled for a simple, linear graph.

There was also the problem of the value changing very quickly. I decided that the best approach would be the same one TCP uses when estimating its RTT, namely the formula old * a + new * (1 - a), where a is a value between 0 and 1. The higher the value of a, the slower the value changes, making it more resistant to temporary spikes and fluctuations (just as in the case of TCP and RTT). This worked really well.
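
As a sketch (the value of a below is made up; the real one is part of the calibration):

    // Exponentially weighted moving average, the same idea as TCP's RTT estimation.
    // The closer Alpha is to 1, the more resistant the value is to sudden spikes.
    class MovementSmoother
    {
        const double Alpha = 0.8; // illustrative, not the calibrated value

        double smoothed;

        public double Update(double newValue)
        {
            smoothed = smoothed * Alpha + newValue * (1 - Alpha);
            return smoothed;
        }
    }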

This week

I will continue testing with different numbers of people and at different distances. I will also expose some of these calibration values as properties, so they can be configured by the developer. I will also try to get some more documentation done and write a little on the report.

If there's time I will try to start adapting the plugin system in Stoffi so it can handle plugins which manipulate the volume.

Challenges

The biggest challenge right now is finding time. But that's expected when you have a five-month-old daughter, it's summer, and there are two weddings to attend and a trip to Skåne. But I'll do my best. The focus will be on writing the report as well as improving the calibration and documentation of the library.

Update: cloud services, social sharing, donations and much more

Finally! It's here!

Yesterday I pushed out the new website, and just a few minutes ago I released the latest upgrade of Stoffi. It contains mostly under-the-hood changes, but there are some pretty neat features worth highlighting. Here are my favorites:

Cloud services

The biggest change has been on the website. It is now possible to register an account either by filling out a form or by logging in with Facebook, Twitter, Google, SoundCloud, Vimeo or LinkedIn. Registration is very simple and easy: there's no confirmation step, no gender, no address, no security questions; if you use the form you only need to provide an email address and choose a password. I've also kept security as a main focus when designing the account system.

The website has also gained the ability to let other websites or applications connect to it. For example, in Stoffi Music Player you can now log in to your account on stoffiplayer.com and access that data. This opens up some really cool new stuff. For one, you can now synchronize some settings between different Stoffi Music Players, like your shuffle and repeat state or the volume. You can also log in to the website and remotely control all players which are hooked up to the synchronization. This lets you see what song is currently playing, change the volume, pause, or skip to the next track.

The plan is to further expand the settings that can be synchronized. My goal is to have playlists, queue and history in the next version, and then to expand further with shortcut settings, the equalizer and list configurations (sorting, columns, search, selection). The remote control will also be enhanced so you can browse your music and select what to play or queue, create playlists, and so on.

Social network integration

If you have created an account and connected it to Stoffi and to either Facebook or Twitter, you can now share YouTube tracks with your friends. Just right-click the song and there will be a "Share" item.

In your account settings (accessible both via the website and inside Stoffi) you can choose where your shares will appear (Facebook and/or Twitter).

This feature is pretty basic, and I plan on adding the ability to attach messages to shares and to select, on a share-by-share basis, which services the song will appear on.

Donation service

My favorite new feature is a new donation service where you can send money to support any artist you want. This cuts out most of the usual middlemen and gives the artist the biggest slice of the cake. The artist's share is set to 80% by default, but you can set it to anything you want. We also, by default, set aside a 10% share for charity.

The system uses PayPal and we don't save any sensitive information at all. 

Other noteworthy improvements

I have also added search indicators which show you where a search is currently active. This helps you spot why, for example, you're only seeing Eminem tracks in your collection of over 4,000. It also further illustrates how our three different search policies work.

By popular demand I've also added a playlist generator which lets you generate a playlist by randomly choosing a given number of tracks from a larger set. This is very useful if you want to create a playlist to export to your MP3 player, or burn to a CD for your car. In the next version I will add the ability to set the limit as a total size or total length instead of just a number of tracks.
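
If you're wondering how such a generator can be implemented: the standard trick is a partial Fisher-Yates shuffle, which picks the requested number of tracks uniformly at random without repeats. A sketch (not Stoffi's actual code):

    using System;
    using System.Collections.Generic;

    static class PlaylistGenerator
    {
        // Pick `count` distinct items uniformly at random
        // using a partial Fisher-Yates shuffle.
        public static List<T> RandomSelection<T>(IList<T> items, int count, Random rng)
        {
            var pool = new List<T>(items);
            count = Math.Min(count, pool.Count);
            for (int i = 0; i < count; i++)
            {
                int j = rng.Next(i, pool.Count); // random index in [i, pool.Count)
                T tmp = pool[i]; pool[i] = pool[j]; pool[j] = tmp;
            }
            return pool.GetRange(0, count);
        }
    }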

I am also very proud to announce a new language for Stoffi: German (thanks to Tom). There are also some other minor enhancements, a ton of bug fixes and some improvements to stability and performance.

Sunday, June 24, 2012

We support Do Not Track

Good news, everyone!

I have just added a few lines of code to our upcoming website which will turn off our Google Analytics tracking if your browser sends the DNT (Do Not Track) header. For more information on this feature and how to enable it in your browser, check out the DNT website.
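
The check itself is tiny. Our website isn't written in C#, but the logic amounts to this:

    static class AnalyticsGate
    {
        // Emit the analytics snippet only if the browser did not opt out.
        // Browsers with Do Not Track enabled send the header "DNT: 1".
        public static bool ShouldEmitAnalytics(string dntHeader)
        {
            return dntHeader != "1";
        }
    }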

Monday, June 18, 2012

Project Dance Controller: Report 4

This post is part of Project Dance Controller.

It's a bachelor thesis project with the aim of letting the quantity of dance movements in the room control the volume level using the Kinect for Windows hardware.

You can read all reports here.

Last week

The last few days I have been busy trying to smooth out and fix the depth image that I get from the IR sensor. I have successfully interpolated missing pixels and applied a bitmask to isolate only the bits indicating the distance to the object.

As a start, here's a picture where every pixel has been shifted three bits. I have also translated the various depths into the colors blue (far away), green (middle) and red (near). Black areas are missing pixels: spots where the IR laser isn't able to determine the distance because the material absorbs, refracts or diffracts the light, preventing it from reflecting back to the sensor. I have also encoded pixels with a depth beyond the max depth as white. There are no such pixels in this picture, but they appear if I aim the Kinect toward something that's farther away than 4 meters.

A raw depth image.
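
The coloring is just a per-pixel mapping from depth to a diagnostic color, roughly like this (the band boundaries below are made up for illustration):

    using System.Drawing;

    static class DepthVisualizer
    {
        // Map a depth in millimeters to a diagnostic color:
        // black = unknown, white = beyond max depth, red/green/blue = near/middle/far.
        public static Color DepthToColor(short depth, short maxDepth)
        {
            if (depth == -1)      return Color.Black; // missing pixel
            if (depth > maxDepth) return Color.White; // beyond the 4 meter range
            if (depth < 1000)     return Color.Red;   // near
            if (depth < 2500)     return Color.Green; // middle
            return Color.Blue;                        // far away
        }
    }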

Removing white pixels is very easy: I just set them to the max depth (4000) and they appear blue. The black pixels, however, are more difficult. Here I decided to use nearest neighbor interpolation to determine the value of each black pixel: I look around the pixel for any non-black pixel, extending the search farther and farther out until I find a pixel with a proper value, and then give my black pixel that value.

An interpolated depth image.
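
Here is a sketch of that search (simplified to an expanding ring around the pixel; the real scan over the frame is linear, left-to-right and top-down, which is what causes the artifacts described below):

    using System;

    static class DepthInterpolator
    {
        // Replace an unknown pixel with the value of the nearest known neighbor,
        // searching rings of increasing radius around (x, y).
        public static short NearestKnown(short[] depth, int width, int height, int x, int y)
        {
            for (int radius = 1; radius < Math.Max(width, height); radius++)
            {
                for (int dy = -radius; dy <= radius; dy++)
                {
                    for (int dx = -radius; dx <= radius; dx++)
                    {
                        // only visit the border of the ring, not already-checked pixels
                        if (Math.Max(Math.Abs(dx), Math.Abs(dy)) != radius)
                            continue;

                        int nx = x + dx, ny = y + dy;
                        if (nx < 0 || ny < 0 || nx >= width || ny >= height)
                            continue;

                        short value = depth[ny * width + nx];
                        if (value != -1)
                            return value; // first known neighbor wins
                    }
                }
            }
            return 4000; // nothing found; fall back to the max depth
        }
    }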

This produces some artifacts, since the scanning is linear (left-to-right, top-down). The result is a lot of blocking, and this actually creates more "noise movement" than the original depth image.

The interpolation plus bit shifting is done in 8-9 milliseconds (ms), which is pretty fast and allows me to process images at a rate of over 100 frames per second (fps).

I also tried applying a mean filter after the interpolation, to smooth out the blocking without removing edges, but that cost a serious amount of CPU cycles and increased the time to over 30 ms, giving me less than 30 fps.

Challenges

The biggest issue here is that the interpolation may remove "pixel noise" (pixels with unknown depth) but introduces more "movement noise": the interpolation algorithm switches some blocks of pixels from green (around 2 meters) to red (around 30 centimeters). That's a lot of movement that does not actually occur.

So, as much as it hurts, I may have to skip the interpolation and mean filter altogether and declare this a week in which I learned why certain techniques do not fit the purpose of my code. I should not see this as a failure; instead, this was the week I did interpolation and mean filtering for the first time, learning new and well-known algorithms in the field of image processing. However, neither of those algorithms works for me, so I will just use the raw depth image when I move on to the next step: analyzing the difference between two frames.

This week

So this week I will try to actually get a value between 0 and 10 out of the images. Even without interpolation there's still some noise movement, so I need a way to remove it. I was thinking of dividing the image into squares and analyzing each square separately. To remove noise movement I will take 3-5 frames and calculate the average movement between them, then compare it to the average movement of the next 3-5 frames. This should keep my algorithm fast enough while still smoothing out movement caused by missing pixels.

Challenges

The biggest challenge will be accounting for the false movement from the missing pixels while still doing the processing fast enough for the system to both feel responsive to the user and properly detect real movement in the picture.

If I have the time I will also try to make the algorithm not give false positives when the camera itself is moved, but I consider that low priority right now.

Saturday, June 16, 2012

Beta update: minor stability fixes

Time for yet another update to the beta. This time I've fixed a number of smaller bugs. Most notable is an added buffer in the synchronization, which increases the performance and stability of the sync system.

There are also some improvements to the volume slider, which can now be used more easily with the arrow keys. A rare crash during YouTube search has been fixed, as well as a crash which occurred when a library was changed outside Stoffi and, for some reason, Windows wasn't able to determine the library's type.

I've also fixed a crash which occurred if you dropped a bunch of files into Stoffi from the Explorer window when there was a scan already in progress.

Most of these bugs are small and/or rare, which is a clear indication that the beta is becoming stable. Hopefully we can release all these new changes into the stable channel very soon. Please help us test out the new stuff, especially the new cloud features: sync, remote control, login, registration, Facebook/Twitter/Google linking, etc.

I am also preparing a new update of the alpha version of Stoffi which will introduce quite a lot of new, fun stuff for you. Stay tuned!

Monday, June 11, 2012

Project Dance Controller: Report 3

This post is part of Project Dance Controller.

It's a bachelor thesis project with the aim of letting the quantity of dance movements in the room control the volume level using the Kinect for Windows hardware.

You can read all reports here.

Last week

I have successfully detected the presence of the Kinect device. My Connected and Disconnected events fire when the device is plugged in or unplugged, respectively. I also managed to get in some code to control the motor and retrieve the depth image from the sensor.

Regarding the exposure of the raw data, there is a sensor class which I can just forward in a property to any caller, so they can manipulate the sensor directly. I have not yet added that code but aim to do so this week.
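
With the Kinect SDK this boils down to listening for status changes on the global sensor collection. A minimal sketch of the idea (the event names are the ones mentioned above; the rest of the wiring is simplified and not the library's actual code):

    using System;
    using Microsoft.Kinect;

    class DeviceManager
    {
        public event EventHandler Connected;
        public event EventHandler Disconnected;

        // The raw sensor, forwarded so callers can manipulate it directly.
        public KinectSensor Sensor { get; private set; }

        public DeviceManager()
        {
            KinectSensor.KinectSensors.StatusChanged += OnStatusChanged;
        }

        void OnStatusChanged(object sender, StatusChangedEventArgs e)
        {
            if (e.Status == KinectStatus.Connected)
            {
                Sensor = e.Sensor;
                if (Connected != null) Connected(this, EventArgs.Empty);
            }
            else if (e.Status == KinectStatus.Disconnected)
            {
                Sensor = null;
                if (Disconnected != null) Disconnected(this, EventArgs.Empty);
            }
        }
    }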

Challenges

I did not really have any problems, even though I was pretty sure last week that there would be some. It's a really nice surprise when stuff actually just works out great.

This week

This week I will start to actually analyze the depth image. The data is a simple array of shorts, of length FrameWidth x FrameHeight (640x480). There are several things I need to do before I can even start with the analysis. First of all, each 16-bit short contains three bits which are used to identify "layers" where players have been detected. This detection is not very good and mostly only works under ideal conditions, so I will discard those bits by shifting the short three bits. The remaining 13 bits give the distance to the point in millimeters.
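
Per pixel, the shift is a one-liner (the SDK also exposes the bit layout via DepthImageFrame.PlayerIndexBitmaskWidth, so the 3 doesn't have to be hard-coded):

    static class DepthPixel
    {
        // Strip the three player-index bits; the remaining 13 bits
        // give the distance to the point in millimeters.
        public static short DistanceInMillimeters(short rawPixel)
        {
            return (short)(rawPixel >> 3);
        }
    }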

Next, there are a lot of "holes" in the depth image, due to the infrared laser not being able to handle certain surfaces. My coffee table, my hair, glass bottles and my shelf are some of the things which create these holes. These holes don't have a very clear border or edge; they kind of flicker. This creates "movement" in the image even though nothing is actually moving. I need a way to remove that "noise movement" from the image.

Challenges

This preprocessing will create some overhead in my calculations for each frame. After reading up on previous research done with Kinect depth images, it seems it will be hard to process around 30 frames per second if I intend to go on and detect heads as well. As an example, one of my references, which detects heads and then continues in that layer to extract the body (with mixed results), takes over 27 seconds per frame. That is totally unacceptable for me: my analysis cannot take more than 60 milliseconds.

So the solution will be to once again revise the plan. Instead of detecting heads I will just process the depth image this week and prepare it for next week, when I will detect movements using two images. I believe that if I relate the movement to the distance, I don't need to know the number of people in the room.

This should also help me get the whole analysis done fairly fast and let me keep a high frame rate from the sensor.

Friday, June 8, 2012

LinkedIn and eHarmony: good password security is not that hard

In light of the recent breaches of the two very popular websites LinkedIn and eHarmony, and with Last.fm investigating a password leak as well, I want to remind everyone to practice good password security. I also want to extend a small question to LinkedIn and eHarmony regarding their security practices: WTF?!

Tips for users

First of all, if there's only a single lesson any user should take from this mess, it's this: never use the same password on two websites. If one of those websites is compromised, your account on the other will easily be broken into as well.

Of course, it's not very easy to remember a lot of passwords. You can either use a password manager like 1Password, or divide websites into classes where you have different passwords for "important" websites like Google, your bank and Facebook, and then share a password across less important websites like forums you rarely visit. The former is of course more secure, but it makes it a lot harder to quickly log into websites from a new computer.

Incompetent developers are dangerous

Now that we've covered good password policy for users, it's time to move on to the website operators' responsibility. How can eHarmony talk about "robust security measures"? First of all, they did not salt their passwords. If you do not salt passwords when you hash them, you are doing the equivalent of putting duct tape around the door instead of locking it. With rainbow tables it's extremely easy to crack unsalted password hashes.

I would also want to know why eHarmony considers a load balancer a robust security measure. It's very apparent that these engineers are not very knowledgeable when it comes to Internet security.

But salting password hashes is not enough. One also needs to use a strong hashing algorithm. eHarmony used MD5, which is old and considered broken. LinkedIn used SHA1, which is more secure. But even if you use a good algorithm and salt the hashes, there's still room for improvement.

We take security seriously

As I've discussed before, our upcoming account system uses a number of measures to further increase security. First of all, we hash passwords before they are sent to the server, using the SHA256 algorithm salted with the user's email. When the hashed password reaches the server it is hashed again, using the SHA1 algorithm. This time we salt it both with a random string which we store in the database, and with another random string which is stored outside the database. The reason we use a salt stored outside the database is that most password leaks only involve the attacker gaining access to the database, not the file system.

Furthermore, the server-side hashing is done several times over, not just once. This increases the time it would take to crack the passwords by several orders of magnitude, rendering them virtually impossible to crack.
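
To make that concrete, here's the shape of the server-side step (a sketch in C# for illustration; our site doesn't actually run on .NET, and the iteration count below is made up):

    using System;
    using System.Security.Cryptography;
    using System.Text;

    static class PasswordHasher
    {
        // clientHash: the SHA256 hash computed on the client, salted with the email.
        // dbSalt: random string stored in the database next to the user.
        // fileSalt: random string stored outside the database, on the file system.
        public static string Hash(string clientHash, string dbSalt, string fileSalt)
        {
            using (SHA1 sha1 = SHA1.Create())
            {
                byte[] data = Encoding.UTF8.GetBytes(clientHash + dbSalt + fileSalt);
                for (int i = 0; i < 10000; i++) // iterate to make cracking slow
                    data = sha1.ComputeHash(data);
                return Convert.ToBase64String(data);
            }
        }
    }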

None of this is very hard. It takes less than a day to implement and should be considered common sense when it comes to security practices. Apparently there are lots of million and billion dollar companies out there that don't have the skill to properly secure their systems. Heck, Sony even stored passwords and credit card numbers in plain text. That's beyond incompetent, that's dangerous!

But it can get even better

As a last note I want to mention two-factor authentication. Usually you need something you know in order to log in to a website: your password. To get into your car or house you need something you have, such as a key. Security measures can also be based on something you are (a fingerprint, voice, or retina scan). Two-factor authentication means you need two of these things. For example, ATMs use two-factor auth since they require both something you have (the card) and something you know (the PIN).

Websites can use two-factor auth by requiring a password (something you know) and access to your cellphone (something you have), sending you an SMS with a code to enter every time you log into your account from a new computer. I would love to do this at Stoffi, but sending text messages isn't free, so for now we'll have to skip it. However, I recommend everyone use it where possible, for example on Facebook and Google.

Monday, June 4, 2012

Project Dance Controller: Report 2

This post is part of Project Dance Controller.

It's a bachelor thesis project with the aim of letting the quantity of dance movements in the room control the volume level using the Kinect for Windows hardware.

You can read all reports here.

Last week

I have created a skeleton library with some API calls, which I have documented in code. I will use the generator from the Stoffi project to convert the specification comments into wiki pages, so they can be read directly on the wiki. The code is available on GitHub.

I have also started to look through some of the code examples for dealing with the Kinect for Windows device in C#. It seems pretty straightforward to get the raw data; the tricky part will be to do something useful with it.

I've also uploaded the report, as promised in the last report. It's in docx format since it's a work in progress; I will create a PDF version when it's finished.

Challenges

There were no problems with the code, but I did find some problems with the plan to quantify movement based on fine movements of POIs.

Since studying the reference material I mentioned in the last report, I've come to the conclusion that I should use the depth image for the analysis. It provides better means of finding edges (and thus humans) than the visible light camera, since it can handle strange backgrounds, weird colors and bad or non-existent lighting. The skeleton tracking is a total no-go as it can only track two people at a time, and it's very much hit and miss unless the person is positioned right in front of the camera with all limbs visible, facing forward.

So I have revised the plans a little bit. I will not try to filter out movement such as walking or going from sitting to standing. Instead I will do my analysis in two stages: first I will detect heads and count the number of people in the room, then I will quantify the total amount of movement, normalize it to scale, and weight it against the number of people in the room. The downside is that funny hats or other stuff touching the head will create false negatives, but since the head count is only used to better interpret the total movement, it will not affect the outcome too much.

This means that two weeks of the schedule will change. Instead of identifying POIs (head, hands, etc.) I will detect heads, and the week after I will quantify all movement in the depth image and relate it to the number of people detected. I have updated the schedule accordingly.

This week

Today I will start working on detecting the presence of a Kinect device and try to control its motor, as well as get the depth image from it. I will finally get acquainted with the Kinect SDK and really get to know the device. Most probably I will learn some stuff which will force me to rethink the API, or at least extend it a little bit. That's why I've kept the API very simple and basic.

Challenges

The biggest problem I anticipate will be getting the device and my code to work together. It's the first time I've ever controlled an external device from code (I've never even read data from a webcam or microphone before), so there's a lot that can prove challenging. Hopefully there are plenty of resources to guide me if I get stuck.