http://blog.comsysto.com/2013/04/25/data-analysis-with-the-unix-shell/
It is also possible to perform joins in the Unix shell with the join command. join assumes that both input files are sorted on the field used as the join key. You can find another dataset on GitHub which contains countries. This dataset is a comma-separated list as well. The 14th column in the country dataset holds the capital id, which corresponds to the id in the first column of the city dataset. This makes it possible to create a list of countries with their capitals.
    bz@cs ~/data/ $ cat city | head -n 2
    1,Kabul,AFG,Kabol,1780000
    2,Qandahar,AFG,Qandahar,237500

    bz@cs ~/data/ $ cat country | head -n 2
    AFG,Afghanistan,Asia,Southern and Central Asia,652090,1919,22720000,45.9,5976.00,,Afganistan/Afqanestan,Islamic Emirate,Mohammad Omar,1,AF
    NLD,Netherlands,Europe,Western Europe,41526,1581,15864000,78.3,371362.00,360478.00,Nederland,Constitutional Monarchy,Beatrix,5,NL

    bz@cs ~/data/ $ join -t "," -1 1 -2 14 -o '1.2,2.2' city country | head -n 2
    Kabul,Afghanistan
    Amsterdam,Netherlands
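Because join compares keys as strings, numeric ids such as 1, 2, 5, 10 are not in the order it expects (lexically, 10 comes before 2). The following is a minimal sketch of sorting both files on the join field first. It is not part of the original post: the temporary directory, the `*.sorted` file names, and the `result` variable are hypothetical, and the city rows with ids 5 and 10 are illustrative values added only to show the ordering problem.

```shell
#!/bin/sh
# Use byte-wise collation so sort and join agree on the ordering.
LC_ALL=C
export LC_ALL

# Hypothetical scratch directory holding tiny sample copies of the two files.
workdir=$(mktemp -d)

# City sample: id,name,country code,district,population.
# Rows 5 and 10 are illustrative, not copied from the dataset.
cat > "$workdir/city" <<'EOF'
1,Kabul,AFG,Kabol,1780000
2,Qandahar,AFG,Qandahar,237500
5,Amsterdam,NLD,Noord-Holland,731200
10,Tilburg,NLD,Noord-Brabant,193238
EOF

# Country sample: the capital id is the 14th field.
cat > "$workdir/country" <<'EOF'
AFG,Afghanistan,Asia,Southern and Central Asia,652090,1919,22720000,45.9,5976.00,,Afganistan/Afqanestan,Islamic Emirate,Mohammad Omar,1,AF
NLD,Netherlands,Europe,Western Europe,41526,1581,15864000,78.3,371362.00,360478.00,Nederland,Constitutional Monarchy,Beatrix,5,NL
EOF

# Sort each file lexically on its own join field.
sort -t, -k1,1 "$workdir/city" > "$workdir/city.sorted"
sort -t, -k14,14 "$workdir/country" > "$workdir/country.sorted"

# Join city field 1 with country field 14; print city name and country name.
result=$(join -t, -1 1 -2 14 -o 1.2,2.2 "$workdir/city.sorted" "$workdir/country.sorted")
printf '%s\n' "$result"

rm -rf "$workdir"
```

With these sample rows the script prints Kabul,Afghanistan followed by Amsterdam,Netherlands. Cities that are not a capital (ids 2 and 10 here) have no matching country row and are dropped by join.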