A note for myself: the product has grown from v7 to v9 now.
I have heard of Hadoop many times but never got deep into it. Now I have a chance, so I started an experiment in Microsoft Azure.
First, you can find a number of Hadoop packages in the Azure Marketplace. I chose Hortonworks Sandbox with HDP 2.5. I tried Hadoop by Bitnami as well, but its usability is a bit tricky: I couldn’t find a way to make Bitnami work without creating a number of accounts and exposing more of my own information. I may try it again later when I have time (and enable boot diagnostics to find the password in the log when the image starts for the first time). For now, I am sticking with Hortonworks.
Then just follow the standard Azure procedure –
- Basics: fill in the VM name, username, SSH key or password, subscription, resource group, location, etc.
- Size: choose the size of the VM.
- Settings: choose the storage, network, etc. I suggest leaving boot diagnostics enabled.
- Confirm on the Summary page and click Buy.
Notice that the price page warns about charges on top of the Azure VM itself. Since the HDP Sandbox showed just 0.0000 CAD/hr, I don’t think you need to worry too much about it. By the way, Bitnami’s Hadoop is also free, as explicitly mentioned.
Wait a few minutes until the deployment succeeds. You can then check the status of your new Hadoop VM. Hortonworks suggests that you make the public IP static. You can find more detailed information on its tutorial page.
The next step is to configure your SSH client. I am using PuTTY on Windows, so there are more mouse clicks than in the config example given in the tutorial. Basically, these settings forward various ports on localhost to the remote VM in the Azure cloud through the SSH tunnel you set up here.
Here is how to configure PuTTY:
- Fill in the public IP of your Hadoop VM
- Expand Connection – SSH
- Click Tunnels
- Fill in the source port and destination, then click the Add button
According to the tutorial, you need to add 8 forwarded ports.
In PuTTY, you add them one by one; the list should eventually show all 8 entries (scroll up and down to see all 8 lines/ports).
You can then go back to the Session page, enter a name in “Saved Sessions” and save the configuration. Next time, you only need to load it from there.
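If you are not on PuTTY, the same tunnels can be written as OpenSSH `-L` options. Here is a minimal Python sketch that builds such a command. It assumes each local port is forwarded to the same port number on the VM's loopback address. The user name, IP address and the two ports below are placeholders for illustration, not the values from the tutorial, so substitute the 8 ports listed there.

```python
import shlex


def build_tunnel_command(user, host, ports):
    """Return an OpenSSH command line that forwards each local port
    to the same port on the remote VM's loopback interface."""
    args = ["ssh"]
    for port in ports:
        # -L local_port:destination_host:destination_port
        args += ["-L", f"{port}:127.0.0.1:{port}"]
    args.append(f"{user}@{host}")
    return shlex.join(args)


# Placeholder user, documentation-range IP and example ports (assumptions).
print(build_tunnel_command("azureuser", "203.0.113.10", [8080, 8888]))
# ssh -L 8080:127.0.0.1:8080 -L 8888:127.0.0.1:8888 azureuser@203.0.113.10
```

Each `-L` option plays the same role as one source/destination pair added on the PuTTY Tunnels page.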
One catch is that the VM needs some time to start and become stable. My first few login attempts failed; only after 20 or 30 minutes could I finally log in, so be patient. After logging in, you should be able to see the following directories.
Then, according to the tutorial, keep the SSH session active and use your browser to visit this page on your VM.
Click the icon on the left and you will see the dashboard.
Click the one on the right to read more advanced topics, including the default username and password and how to change them.
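Instead of retrying the login by hand, you could poll the SSH port until it answers. Below is a rough sketch that assumes SSH listens on the default port 22 and that a 30-minute limit is enough, based only on my own experience above. The function name and its parameters are my own. An open port only means the SSH service accepts connections; the sandbox services behind it may still need more time.

```python
import socket
import time


def wait_for_port(host, port, timeout=1800.0, interval=30.0):
    """Poll host:port until a TCP connection succeeds.

    Returns True as soon as a connection is accepted, or False if
    `timeout` seconds pass without a successful connection.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Open and immediately close a TCP connection as a probe.
            with socket.create_connection((host, port), timeout=5.0):
                return True
        except OSError:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return False
            # Sleep for the interval, but never past the deadline.
            time.sleep(min(interval, remaining))
```

For example, `wait_for_port("203.0.113.10", 22)` would keep probing a VM at that placeholder address for up to 30 minutes.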
That is the first step into Hadoop.
Yahoo China detects my @yahoo.com account and shows me a reminder every time I log in, but the page only mentions that @yahoo.com.cn and @yahoo.cn will be shut down.
My email is @yahoo.com, so then what? The problem is that my Yahoo email was originally registered in China. Here is another page that says just that and offers no solution.
I also see the “&source=alibaba&cnNoRedirect=1#mail” string in the address bar every time I access my Yahoo email. There is no choice. I have moved most of my contacts to another service.
It is time to leave Yahoo! for good.
Today I heard they are Mobile, Social, Cloud, Big Data.
- Search/filter image results by colour(overall/background) and visual property (style, similarity).
- Boldface words in results are Google's associated terms (associations, synonyms).
- Ranked higher in results: pages that contain the words or synonyms; have them in the title or URL; are linked from high-quality pages.
- Word order matters. Capitalization does not matter.
- Most special characters (¶, £, €, ©, ®, ÷, §, %, (), @, ?, !) are ignored in the query, except +, #, $, etc.
- Find words in the page – Ctrl-F or Cmd-F, this is not search.
- Upper-right-hand side panel for search entity that is well-known; Search-as-you-type; Related searches at the bottom of the page.
- Search “define keyword” for a dictionary definition and translation; Search Tools are in the left panel.
- SERP – Search Engine Results Page; rollover preview; title/URL/snippet(abstract)/deeper-links.
- Operator site:domain, including top level domain where the dot can be omitted.
- Operator filetype:pdf, doc, docx, ppt, txt, csv, etc.
- Minus (-) operator to exclude certain keywords.
- OR operator and double-quote.
- Operator “intext:”.
- Advanced search – gear button.
- Search by image: drag-and-drop local file to image search.
- Search features – describe your query – geo, measure, time, flight, weather, movie showtime, etc.
- Conversions [number unit1 in unit2], also currency
- Left hand panel “show search tool” – date range limiting, custom range
- Translation – left hand panel
- Credibility … use time search
- Use Books
- Use WHOIS
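To see how the operators in these notes combine, here is a small Python sketch that assembles a query string from them (double quotes, OR, minus, site: and filetype:). The function names and parameters are my own invention for illustration. The sketch only builds the text you would type into the search box and the matching search URL; it does not call any Google API.

```python
from urllib.parse import urlencode


def build_query(terms, phrase=None, any_of=(), exclude=(), site=None, filetype=None):
    """Combine plain terms with the operators from the notes above."""
    parts = list(terms)
    if phrase:
        parts.append(f'"{phrase}"')           # double quotes: exact phrase
    if any_of:
        parts.append(" OR ".join(any_of))     # OR must be upper case
    parts += [f"-{word}" for word in exclude]  # minus: exclude a keyword
    if site:
        parts.append(f"site:{site}")
    if filetype:
        parts.append(f"filetype:{filetype}")
    return " ".join(parts)


def search_url(query):
    """URL-encode the query for the address bar."""
    return "https://www.google.com/search?" + urlencode({"q": query})


query = build_query(["hadoop"], phrase="getting started",
                    exclude=["bitnami"], site="example.com", filetype="pdf")
print(query)
# hadoop "getting started" -bitnami site:example.com filetype:pdf
print(search_url(query))
```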
[to be continued …]
From time to time, we saw attempts on the proxy or firewall to reach the following destinations:
Because the domain example.com is reserved in RFC 2606, those hosts don’t actually exist, so all the attempts failed. Considering the number of users in the network, how many resources have been wasted on this kind of nonsense traffic? If a proxy is configured, the client periodically sends requests to the proxy, and the proxy then needs to authenticate and process each request. If the user has a direct connection, the DNS server needs to try to resolve this non-existent hostname every few minutes. Think about what happens if there are 1000 users in the same situation in your network.
Here is the log shown on the proxy server for one client; the attempts repeat every few minutes:
The question then became where the traffic was coming from and how to stop it.
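To put a rough number on the waste, here is a back-of-the-envelope calculation in Python. The 5-minute interval is my assumption standing in for "every few minutes", and I count one attempt per host for each of the three hosts; neither figure was measured exactly.

```python
def wasted_requests_per_day(users, hosts_per_client, interval_minutes):
    """Estimate failed lookups/requests per day across all clients."""
    attempts_per_host = 24 * 60 / interval_minutes  # retries per day per host
    return round(users * hosts_per_client * attempts_per_host)


# 1000 users, 3 non-existent hosts, one retry every 5 minutes (assumed).
print(wasted_requests_per_day(1000, 3, 5))
# 864000
```

Under these assumptions the proxy or DNS server handles well over 800,000 pointless requests per day.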
SIP is the keyword, so the traffic must come from an instant messaging client. On the client machine, we found that only Office Communicator was installed, and it had never been configured. The Sign-in address (URI) was the default email@example.com, and somehow the client starts probing or connecting to all three hosts mentioned at the beginning of this article.
I searched the Internet for similar complaints. Most of the results that contain the three hostnames are official Microsoft documents: the OCS Deployment Guides and the Communicator Testing Guide. The domain example.com is used as a real example in those documents.
There is only one thread in the Microsoft online community that discusses the issue, in which worb68_ocs brought up the same concern that I have. The only answer that comes close to the root cause was from Turgay Ongun at Microsoft:
When you install the Communicator client and run it as the very first time, the textbox where you enter your SIP address is firstname.lastname@example.org
If by mistake, any user click the sign in button without entering his/her SIP address, then the communicator tries to find the edge server for example.com for email@example.com SIP address.
None of the other answers was as straightforward.
So the next step is to remove Office Communicator if the user is not using it, configure it with a correct sign-in address, or disable automatic login if it is not in use all the time.
Microsoft should also do something about this in the next release of the Office Communicator or Lync client: leave the Sign-in address blank, or lead the user through a wizard to enter a more meaningful address/URI, instead of just dropping in an example like firstname.lastname@example.org.
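If you manage to collect the sign-in addresses from your clients (how you collect them is outside this note), a small check like the following could flag the ones still pointing at a placeholder domain. This is only a sketch: the function names are mine, and the lists contain the names that RFC 2606 reserves (the top-level domains .test, .example, .invalid and .localhost, and the second-level domains example.com, example.net and example.org).

```python
# Names reserved by RFC 2606.
RESERVED_TLDS = {"test", "example", "invalid", "localhost"}
RESERVED_DOMAINS = {"example.com", "example.net", "example.org"}


def is_reserved(hostname):
    """True if hostname is, or falls under, an RFC 2606 reserved name."""
    labels = hostname.rstrip(".").lower().split(".")
    if labels[-1] in RESERVED_TLDS:
        return True
    return ".".join(labels[-2:]) in RESERVED_DOMAINS


def uses_placeholder_domain(sign_in_address):
    """True if a sign-in address such as 'sip:user@domain' uses a reserved domain."""
    address = sign_in_address.strip()
    if address.lower().startswith("sip:"):
        address = address[4:]
    if "@" not in address:
        return False
    domain = address.rpartition("@")[2]
    return bool(domain) and is_reserved(domain)


print(uses_placeholder_domain("sip:user@example.com"))  # True
print(is_reserved("wordpress.org"))                     # False
```

The same `is_reserved` check could be applied to destination hostnames taken from a proxy log.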
Here is the thread address:
I have been thinking about this for quite some time, ever since MS closed down Live Spaces and forced casual bloggers like me to migrate to somewhere like wordpress.com (or Sina Blog (新浪博客) if you or your audience are in China, where the Internet is only partially interconnected with the rest of the world). wordpress.com also provides a free service, but it is just a very limited subset of the WordPress sphere, while wordpress.org has everything. How can I move to the full version of WordPress?
I grabbed a very old PC from my basement and started installing Windows 2003 on it. Why not Linux? LAMP would be the perfect package for building a modern web site. No, I had tried, and it was no fun at all. I have tried a number of different flavours of Linux in the last ten years, from the classic RedHat to the easiest, Ubuntu. Every time, Linux just made me give up. Maybe this will change in the next ten years. I chose Windows this time, so I would use WIMP: Windows, IIS, MySQL and PHP.
If you go to the website wordpress.org, here is the direct way to get the necessary information to install WordPress on Windows.
You don’t need to find a web host until you have figured out how to install and use the WordPress platform. Look for “2, Download & Install WordPress …” and click it. This takes you to a download page. Yes and no: it is the right page, but you are not downloading anything here. Find “handy guide” and click it, and we are closer. On the Install WordPress page, click “Easy 5 Minute WordPress Installation on Windows”.
There are 5 very straightforward steps; actually, since it is Windows, there are only two steps: download and install. The installation package comes from Microsoft’s Windows Web App Gallery. It will guide you through all the steps, even if you don’t have Web PI (Web Platform Installer), PHP, or MySQL on your server. All you need to have ready is Windows and IIS. The document also provides the WAMP alternative in the next section if you prefer Apache to IIS.
I had my WordPress site running after just another few minutes of configuration.
Next, I am going to build a forum. After some research, I found there are a few open source forum packages available, like Snitz Forums and Open Source ASP.NET Forum. The latter seems quite complicated and needs MS SQL Server, so I am leaving them for now. For simplicity, I stayed with WIMP, with MySQL and PHP. Luckily, there is an open source package: phpBB. Its website phpbb.com provides everything you need; here is the installation document. There is also another site that makes the installation steps much simpler: Build a phpBB Forum in 5 Steps.
This time it took a bit longer, but in less than 15 minutes I had my forum running.
Interestingly, Microsoft has adapted to PHP quite well; it even has a sub-site on its official IIS site, php.iis.net. As for MySQL, we all know it has become a different beast lately: it went to Oracle when Sun Microsystems was bought last year. It may no longer be MySQL, or your free SQL, some time in the future. Anyway, I found another useful article about installing PHP and MySQL on IIS.
While I looked around the php.iis.net site, something interesting caught my attention: a gallery, Gallery Server Pro. Why not add a gallery to my WIMP test lab? Because it is also packaged with Web PI, the installation is a no-brainer. Within the next 15 minutes, I had uploaded a few pictures and had my gallery online in my home network.
By the way, my test environment is a Celeron 566MHz PC with 384MB RAM.