Thursday, June 2, 2011

Automated Financial News Understanding System

I've just had the great pleasure of finishing the http://www.newsmental.com news analysis, aggregation and community opinion formation web 2.0 system. That's right, the system does three main things:
  • News from numerous sources is extracted more than 6 times a day and clustered based on the similarity of the articles' text. This provides a nice overview of all the news, and in addition trending news is highlighted right at the top of the page. This sounds a little bit like news.google.com, but there's much more to it, read on to find out... :-)
  • News is automatically analysed using some pretty heavy text-mining AI techniques in order to extract entities, places, people, facts, relationships and other semantic / meaningful elements from the articles. These extracts are presented in each news item's panel so that the user can avoid having to read the entire article - a quick look at the entities and relationships will provide all the key info needed under time pressure!
  • Most interestingly, the newsmental system allows any user (logged in or not) to rate the sentiment and impact of a news article as they personally perceive it. This is something I would call collaborative news analysis, à la web 2.0! After all, you aren't reading the news alone: every visitor to the site reads the same news items, so why not share your individual understanding of the news with the community, and perhaps understand the news better than you would on your own...
Let me mention that newsmental is a research project in community opinion aggregation for my PhD, and is hence an academic study with the noble research aim of better understanding collaborative news analysis. So in summary, newsmental is a system that can potentially save you a lot of time keeping up with the wide range of business, economics and finance news. This tool could be quite useful to traders and similar professionals, for whom being aware of the news and its overall implications plays a major role!
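
To give a rough feel for what "clustering by text similarity" means, here is a minimal Python sketch. It is not the code behind newsmental: the word-count vectors, the cosine similarity measure, the greedy single-pass grouping and the 0.5 threshold are all assumptions chosen to keep the example short.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lower-case the text and count its word tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (0.0 to 1.0)."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_articles(articles, threshold=0.5):
    """Greedy single-pass clustering (an illustrative choice, not newsmental's).

    Each article joins the first cluster whose seed article is similar
    enough, otherwise it starts a new cluster. Returns lists of indices.
    """
    clusters = []  # list of (seed_vector, [article indices])
    for index, text in enumerate(articles):
        vector = bag_of_words(text)
        for seed, members in clusters:
            if cosine_similarity(seed, vector) >= threshold:
                members.append(index)
                break
        else:
            clusters.append((vector, [index]))
    return [members for _, members in clusters]
```

Calling cluster_articles on a list of article texts returns groups of article indices, so two differently worded stories about the same event should end up in the same group. A real system would most likely weight words (e.g. TF-IDF) and drop common stop-words first.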

Thanks for reading all the way through. To finish off the article, here are a few useful links:
[1] Quick (2 minute) Guide - http://www.newsmental.com/tutorialintro.aspx
[2] FAQ and some background info: http://www.newsmental.com/faq.aspx (here you can leave your email address if you are interested in the outcomes of our study)
[3] Full Tutorial - http://www.newsmental.com/tutorial.aspx

Wednesday, May 11, 2011

AJAX UpdatePanel in ASP.net fully explained!

Tip: avoid the ASP.NET update panel whenever you can!

The UpdatePanel is a quick and (very) dirty way to enable some AJAX on an ASP.NET web page. Simply put one or several UpdatePanels onto a page, together with one ScriptManager for the page. Usually you'll want to set the UpdateMode of the UpdatePanel(s) to "Conditional", and in case you have user controls on your page you might need to set EnablePartialRendering="true" on the ScriptManager (I believe this is already the default). It often seems to work just great: you get those famous flicker-free partial page postbacks that are so characteristic of AJAX. Unfortunately, under the hood the UpdatePanel causes a full re-instantiation of the Page's control tree, and every single control runs through its life cycle events.

Problems

I can understand that this abstraction offers a certain amount of familiarity and simplicity that maybe some naive programmers will welcome very much, however it is misleading and utterly non-"ajaxy". Just imagine that you generate HTML on the fly for the same resource, depending on the request and other parameters (this is not at all uncommon with many dynamic web 2.0 applications), and you need some AJAX functionality on that page: then the UpdatePanel will be an utter nightmare. Since the so-called AJAX UpdatePanel actually re-instantiates the entire control tree and plugs into the page life cycle behind the scenes, so that all the control events can be accessed nicely from code-behind, your dynamically generated page would need to be re-loaded from the viewstate or session state manually (kinda sucks)! See this post, but especially this post, for an illustration of the extra overhead on your side to achieve this.

Try... get this :-)

Obviously this seems like too much work on the server side when all you wanted to do is send or retrieve a little bit of data to or from your web/db server asynchronously. The whole idea of AJAX is that you update only the small area of the page that needs updating. Since most of the HTML can stay the way it is, a lot of bits on the wire and server processing time can potentially be saved, and the side effect is a flicker-free, quick, responsive web page. With the ASP.NET UpdatePanel it seems the main goal of the control is just the flicker-free update. I found a post that highlights the common mistakes with the UpdatePanel, where some comments sadly express the misleading opinion that this is a down-to-earth, logical design. The truth is that once you know exactly what the UpdatePanel does you can live with it, and in some basic situations it might be quite alright to use it, but it certainly isn't good AJAX design by any standard.

The illustration below shows the desired AJAX scenario:


So problems begin if, for example, you generate controls dynamically based on the first page load or on user interaction, which is very common in today's dynamic web. If an UpdatePanel is used in such a scenario, it is necessary to keep track of the controls that have been generated. Usually this has to be done in the viewstate, and the framework doesn't do it for you: you have to code up the state preservation (i.e. saving to / retrieving from viewstate at the right time in the page life cycle) yourself. This can bring a great deal of unexpected and, most importantly, unneeded complexity.

Of course you can decide to stick with the UpdatePanel [for some very, highly, extremely strange reason :-)], and take care of the state management of dynamically generated controls as described in this stackoverflow.com post, or this one. Have fun ;-)...

The Solution (page methods, etc...):

Fortunately we can simply use direct AJAX calls. Microsoft's engineers realised that the UpdatePanel (in most non-trivial scenarios) simply sucks and provided us with alternatives, specifically page methods. These are great: essentially a web-service type of method that can be declared as a public static method in the web page class, rather than having to create a new web service to expose the method. Page methods allow you to keep code in one place and I love them. Data is returned as JSON by default, but the format can easily be changed to XML, for example (since JSON isn't capable of representing certain complicated self-referential data items). Check out this page for a good example of a page method in use... Of course standard web services can also be used; the options other than the UpdatePanel are discussed in some detail within this great MSDN Magazine article written by Jeff Prosise.

jQuery, or for that matter any other AJAX-supporting JavaScript library, can be used instead (quite easily) to take care of asynchronous server communications. The guy from Encosia shows how to do this in jQuery in a neat short article - check it out.

Finally, don't forget that if you use any postback controls, such as HTML buttons or the ASP.NET Button and LinkButton controls, the client-side click handler (OnClientClick on the server controls) must contain something like "return false;", otherwise a page postback occurs anyway as the server-side generated button click event triggers. If you follow up the resources above, you will find that using AJAX instead of the UpdatePanel is actually very easy once you've done it a few times.
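
To show how little actually travels over the wire with a direct call, here is a rough sketch of the HTTP request that a client-side library sends to a page method. It is written in Python only so that it can be run outside a browser; in practice this would be a jQuery or similar JavaScript call. The page name Default.aspx and the method name GetServerTime are made up for illustration. As far as I know, a page method is reached by POSTing JSON to "Page.aspx/MethodName", and newer ASP.NET versions wrap the JSON reply in a "d" property.

```python
import json
import urllib.request


def build_page_method_request(page_url, method_name, arguments):
    """Build (but do not send) the POST request for an ASP.NET page method."""
    body = json.dumps(arguments).encode("utf-8")
    return urllib.request.Request(
        url=f"{page_url}/{method_name}",  # e.g. Default.aspx/GetServerTime (hypothetical)
        data=body,
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


def unwrap_page_method_response(raw_body):
    """Decode the JSON reply, unwrapping the "d" property if it is present."""
    payload = json.loads(raw_body)
    if isinstance(payload, dict) and "d" in payload:
        return payload["d"]
    return payload
```

The request could then be sent with urllib.request.urlopen(request) and the reply body passed to unwrap_page_method_response. Compare that small JSON body with an UpdatePanel postback, which sends the whole form including the viewstate.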

Conclusion

In conclusion, the UpdatePanel is nasty: it costs a lot of bandwidth, and a lot of control is lost due to the way Microsoft decided to hook it into a page's life cycle. Some of that control can be regained by using the client-side page-script-manager object as described on this page, however it doesn't remove the need for manual state management of dynamically generated controls!

Sunday, April 3, 2011

Removing a Fake-antivirus / Spyware

So this Saturday I noticed that my Eee PC Acer Netbook-laptop (running Windows XP) got infected by one of those nasty fake antiviruses. Having worked for over a year at the Loughborough Uni PC-Clinic, I knew straight away what to do, but this one was a nasty one and it took me over 6 hours to clean and repair my system. Having a complicated programming environment set up on my laptop, I really didn't feel like re-installing everything, so I decided to identify and eliminate the virus carefully. I briefly share my experience here since it could help some other poor soul with the same problem.
  1. ...as soon as you notice the annoying pop-ups and fake/suspiciously-looking security centre warnings, restart your system and boot it up into "Safe-Mode with Networking". On most laptops you need to press/hold F8 to get the screen from which to choose Safe-Mode...
  2. ...the problem usually is that the spyware will de-associate .exe files and you won't be able to start any programs, such as the command line, regedit or even a browser (the browser's start-up page and proxy settings are also changed, so be careful to fix your browser settings). If you are lucky, safe-mode will prevent the spyware from running, but in my case safe-mode didn't help. There's a surprisingly simple trick though: if you have several accounts on your XP system and you usually use only one of them, log in with one that you use rarely or never (quite often this is the administrator account), and if you are lucky it turns out that the fake anti-virus will not have infected that user account!
  3. ...I was lucky that my Admin user account wasn't infected, and from there I was able to look for the process in Windows Task Manager, search my system for the related files and delete them manually [be careful not to delete system files - also, most likely you will have to enable the viewing of system files under Win-XP].
  4. ...in safe-mode I was also able to run the following tools: Malwarebytes, SpyBot Search & Destroy and SUPERAntiSpyware. The reason for running "Safe-Mode with Networking" is that you can connect to the internet to update to the latest anti-spyware/virus definition files. If a connection to the internet cannot be established, make sure to install the latest versions of the tools; for SpyBot Search & Destroy you can also install the latest definition files separately, which is very handy. Run all the tools in sequence, rebooting back into safe-mode in between [this is important!!]. Each tool found different elements of the spyware and together they were able to remove most of it. This will take a lot of time, as each scan can take much more than an hour (I tend to set the process priority for the scan to "Real Time", as this way the OS scheduler allocates more CPU time to the process and the scan runs quicker).
  5. ...once I was relatively sure the system was clean, I ran the fullest possible scan with SUPERAntiSpyware again. This detected a few more issues, and only when I was pretty sure the system was clean did I boot up normal Win-XP.
You might be done now! - but in my case my exe file associations were still broken (part of the devastation the fake anti-virus left behind). In order to fix this you can manually edit the registry, download registry entries to merge with your registry, or simply run a tool for XP, which is what I did this time and it worked a treat.

I recommend you also do your own research. I found many useful articles online, such as this one, and depending on the version of the spyware / fake anti-virus you might need to take a slightly different approach. Good luck!

Wednesday, March 23, 2011

My Research Survey

In my blog post from February 2nd 2011, I mentioned several resources for constructing proper and effective questionnaire surveys. After an initial draft version I have now put my anonymous survey online. Please feel free to visit and help me with my research by filling out the questionnaire at www.newsmental.com/survey.aspx.

I designed the survey by following some of the principles mentioned in my blog post from February 2nd, but most of all I made an effort to keep the survey as short as possible, as I really don't like filling out 20-minute-long surveys. It should only take about a minute to fill out; if it doesn't, or if you have any issues or comments, feel free to leave me a message.

Saturday, March 19, 2011

Krispy Kreme - Doughnut Sale

This Tuesday I helped organise a campus-wide charity doughnut sale, aiming to help bring children from Bethlehem to the UK for a cultural exchange stay, through the registered charity Betlehem Link.


The group is called the Hakaya band, a Palestinian folk dancing and singing group which represents the Palestinian way of life through songs and movements set to the Dabka [the traditional Palestinian dance style, danced at celebrations and festivals].

We set up two sites on campus and sold doughnuts. My friends from the "Friends of Palestine" deserve all the credit since they did a fantastic job, selling the doughnuts and coming up with the idea in the first place. Krispy Kreme runs a charity fundraising branch of its business that allows one to resell their fresh doughnuts at a mark-up, with the profit helping your cause. Fortunately Krispy Kreme seemed to be a very popular doughnut brand with the students on campus, so they sold relatively easily.

I must say that being involved in raising money for charity, where you know that the money will have a real impact (i.e. help cover the travel costs, which would otherwise be unaffordable), is just really wonderful. We sold well over 800 doughnuts and I am absolutely certain that the proceeds will make a real difference to these kids!




Friday, March 11, 2011

8.8 Earthquake & Tsunami - Japan

A horrible magnitude 8.8 earthquake has hit north-eastern Japan, sending tsunami waves across most of the Pacific Ocean. Hardest hit are the coastal areas in the north-eastern part of Japan.

This graphic shows the energy flow of the tsunami through the Pacific Ocean [note: the coast of Chile might be hit surprisingly hard, according to this model's estimation].
Lives have been lost, and many people are still missing...

Google has also made available their person finding tool:

Saturday, February 5, 2011

Data from the World Bank

There is a wealth of data sets about pretty much every country in the world, available for free from the World Bank since April 2010. Data ranges from agricultural use and educational / medical standards to financial development indicators. I discovered this resource while looking for some macroeconomic data for one of my PhD experiments. Weirdly enough, I ended up browsing the huge data set for hours and had a great time discovering the various differences in education, health and economic production between countries. For example, I found that people are very keen on education in Kazakhstan, in certain years even more so than in France - play around by selecting different countries.



Data from World Bank, Martin's Blog :-)

You can even compose cool widgets like the one above!

I personally most enjoy browsing the data by the actual available indicators and getting to know what they mean. I found this to be a great resource and hence had to share it with my readers. Btw, there's also an API for any developers out there, and it looks pretty neat.
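
For the developers, here is a small Python sketch of how one indicator series could be pulled from the API. Treat the details as assumptions to check against the official API documentation: the sketch assumes the current v2 endpoint at api.worldbank.org (the address may have been different when this post was written), the format=json and per_page query parameters, and a JSON reply shaped as a two-element list of paging metadata followed by the data records.

```python
import json
import urllib.parse
import urllib.request

API_ROOT = "https://api.worldbank.org/v2"  # assumed endpoint, see note above


def indicator_url(country_code, indicator_code, per_page=100):
    """Build the URL for one indicator of one country, asking for JSON."""
    query = urllib.parse.urlencode({"format": "json", "per_page": per_page})
    return f"{API_ROOT}/country/{country_code}/indicator/{indicator_code}?{query}"


def parse_indicator_response(raw_body):
    """Return {year: value} from a reply shaped like [metadata, [records...]].

    Years without a value are skipped; an error reply (which has no
    second element) yields an empty dict.
    """
    payload = json.loads(raw_body)
    records = payload[1] if len(payload) > 1 and payload[1] else []
    return {
        record["date"]: record["value"]
        for record in records
        if record["value"] is not None
    }


def fetch_indicator(country_code, indicator_code):
    """Download one indicator series (needs network access)."""
    url = indicator_url(country_code, indicator_code)
    with urllib.request.urlopen(url) as response:
        return parse_indicator_response(response.read().decode("utf-8"))
```

For example, fetch_indicator("KZ", "NY.GDP.MKTP.CD") should return Kazakhstan's GDP in current US dollars by year, which can then be compared with the same call for "FR".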

Other Institutions have also opened up much of their data-records:

My favourite book on many of these indicators is the one by Richard Yamarone. I read parts of it and it is an enlightening read: Richard provides a complete view of each indicator discussed, its history, its derivations, computation and uses. Highly recommended!

Wednesday, February 2, 2011

Constructing Surveys

Today I was looking at survey construction and survey design, since I have to prepare a survey. I intend to use the results of the survey to support a specific argument in my PhD Thesis. It isn't a large part of my work, and hence I don't want to spend too much time on this, however not surprisingly I found a lot of excellent sources on survey construction (whether offline, online, free form - interview, or closed form, see this link for a wider set of survey definitions).

An excellent introduction to surveys in the social sciences, detailing the different types of answer collection (i.e. Likert scales, Guttman scales, semantic differentials, ranking and filter/contingency questions), can be found here. Some example questions are provided here. Perhaps most useful for serious, academic-level research are these resources: 1 - a journal article detailing the stages of interview construction in a systematic manner, 2 - a report for the US Census Bureau (2006) on question types and question styles (e.g. it mentions academic research that argues against using "Don't know" answers in interviews, and the reasons why that is the case).

Monday, January 31, 2011

My Personal Page

I just set up my personal page (my blog is still here) on the lboro.ac.uk server, feel free to pay it a visit - http://www-staff.lboro.ac.uk/~comds2...

Thursday, January 27, 2011

Fuzzy Logic (Building a Fuzzy Inference System)

Boolean logic has been around for many years now, however "Fuzzy Logic" is a somewhat more recent "beast": Prof. Lotfi Zadeh proposed fuzzy set theory back in 1965.

In this post I want to show online resources that illustrate that fuzzy systems can be simple and, most of all, are a very elegant way to solve some problems.

I had fuzzy set theory in one of my data-mining modules during my university time, studying comp science. It was straight forward but then I never had to make much use of it (only a tiny bit in my PhD). So anyway, let's just jump into it!

  1. A great way to start is to work through a real illustrative example, in which a non-fuzzy solution is explained, and then the fuzzy solution is introduced and shown to actually work better... <<This page does exactly that>>
  2. Get the hang of it by looking at many more examples... <<Here>>
  3. Try to build a Fuzzy Inference System yourself - based on the idea of "learning by doing"... <<this can be done here>> (note: this is Matlab based, but this doesn't matter at all)
  4. Research papers making use of fuzzy sets might be a useful, cheap option to learn more (<<for example>>); also check out a book (or two) and play with a relevant code library.
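
If you want to see the "learning by doing" idea without any library at all, here is a tiny fuzzy inference system in plain Python. It is only a sketch under my own assumptions: the tipping scenario, the 0-10 rating scale, the triangular membership functions and the three rules are made up for illustration, and it uses the simple zero-order Sugeno style (constant rule outputs combined by a weighted average) rather than the Mamdani style with centroid defuzzification.

```python
def triangular(x, left, peak, right):
    """Triangular membership: 1 at peak, falling linearly to 0 at left and right."""
    if x == peak:
        return 1.0
    if x <= left or x >= right:
        return 0.0
    if x < peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def tip_percent(service, food):
    """Suggest a tip (in percent) from service and food ratings on a 0-10 scale."""
    if not (0 <= service <= 10 and 0 <= food <= 10):
        raise ValueError("ratings must be between 0 and 10")

    # Fuzzification: degree to which each rating belongs to each fuzzy set.
    poor_service = triangular(service, 0, 0, 5)
    average_service = triangular(service, 0, 5, 10)
    good_service = triangular(service, 5, 10, 10)
    bad_food = triangular(food, 0, 0, 5)
    great_food = triangular(food, 5, 10, 10)

    # Rules as (firing strength, tip if the rule fully applies); OR is modelled as max.
    rules = [
        (max(poor_service, bad_food), 5.0),      # poor service OR bad food   -> low tip
        (average_service, 15.0),                 # average service            -> medium tip
        (max(good_service, great_food), 25.0),   # good service OR great food -> high tip
    ]

    # Defuzzification: weighted average of the rule outputs.
    total_strength = sum(strength for strength, _ in rules)
    return sum(strength * tip for strength, tip in rules) / total_strength
```

The nice part is that the output changes smoothly: a service rating of 7.5 is "half average, half good", so the suggested tip lands halfway between the medium and the high tip instead of jumping at a hard cut-off.
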
Software Libraries

Python: pyFuzzy, peach || Java: RockOn Fuzzy, Funzy || C#.net: DotFuzzy

Great Books

Monday, January 17, 2011

My New Year's Blog revamp

With the new year I decided to change my blog up a little bit. First, the design template is now different, not so heavy on the eyes I hope. Secondly, the direct address changed to http://martinsykora.blogspot.com; alternatively http://www.martinsykora.com is still usable.

On a different note: each month I receive an email from NBER (The National Bureau of Economic Research) with the most recent summary of published/working papers by eminent economic academics. Often I just ignore these emails, but this time I saw one article that caught my attention, and I'd recommend it as a great read to anyone interested in the general research method within science. The paper is entitled "Economics, History and Causation" and the authors suggest that a number of qualitative research methods should be more at the centre of what is an overly statistical approach to economic research (economics in particular, since the authors are from that field, however their argument extends equally to the field of, say, empirical computer science).

The paper can be downloaded here http://www.nber.org/papers/w16678.pdf