Some Links and more on Sqoop installation

What with the PhD work and starting a new job at Teradata, I’ve been so busy learning lots of exciting new stuff that I haven’t had my writing head on this month. So here’s a bit of a cop-out and a load of links!

Here’s a nice use case for Hadoop, with some actual code showing an image-to-PDF conversion process:

http://hadoopinku.wordpress.com/2012/02/19/image-to-pdf/

It’s based on the famous New York Times job that ran on Amazon Web Services:

http://open.blogs.nytimes.com/2008/05/21/the-new-york-times-archives-amazon-web-services-timesmachine/

I’ve just joined a new group on LinkedIn:

Hadoop User Group (No SPAM, Vendor Neutral, Actively Moderated)

It’s well worth joining if you’re interested in Hadoop. The old one was just too full of adverts for iPads, iPhones and Russian dating agencies to be useful!

And seeing as there are still a lot of searches coming in from Google for “Install Sqoop”, here are some links to other great blogs. If any of them work better than my instructions, let me know :-)

http://blog.kylemulka.com/2012/04/how-to-install-sqoop-on-amazon-elastic-map-reduce-emr/
https://ccp.cloudera.com/display/CDH4B2/Sqoop+Installation
http://michael.otacoo.com/linux-2/install-hadoop-and-sqoop-in-lucid/
http://shout.setfive.com/2011/09/14/getting-started-with-hadoop-hive-and-sqoop/

Good luck with it!

At the moment I’m ploughing through Mahout in Action, which is an excellent book. Hopefully I’ll be able to post some examples soon.

  • HadoopinKU

    Do you have any bigger and better ideas relating to Hadoop? I am about to start the thesis for my master’s and I am just puzzled about what should be done.
    Take care
    HadoopinKU

  • chillax7

    Hi, it’s a great idea to do something with Hadoop for your master’s; there’s not a huge amount of academic work published on its use. Maybe do a few pages on MapReduce as a programming technique, with pros and cons. A comparison of Hadoop with a relational database would be good for at least half a thesis! Then pick a nice use case, maybe with some of your university’s own data. I know at Dundee there were several departments with good cases. I picked the Life Sciences lab and its mass spectrometer output, but there was a load of data from radio telescopes, or the computing department could have a ton of network or web log info you could analyse. It’s a master’s, so don’t go too big on scope. You’ve only got 6 months, right?
    Good luck with it!