After configuring the four files core-site.xml, hdfs-site.xml, yarn-site.xml, and mapred-site.xml, start the cluster and run the wordcount sample program.

hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar wordcount /input /output

The job fails at run time with the following error:

Error: Could not find or load main class



Run the following command on the command line and copy the classpath it prints.

hadoop classpath

Edit yarn-site.xml

vim yarn-site.xml

Add the following content

<property>
    <name>yarn.application.classpath</name>
    <value>Enter the Hadoop classpath path just returned</value>
</property>

Apply the above settings on all Master and Slave nodes.
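
Since hand-copying a long classpath is error-prone, the property block can also be generated from the command output. This is only a minimal sketch: the helper name make_classpath_property is hypothetical, and it assumes the property meant here is yarn.application.classpath. It strips any line breaks from the classpath before wrapping it in XML.

```shell
# Hypothetical helper: wrap a classpath string in a <property> block.
# Assumption: the target property is yarn.application.classpath.
make_classpath_property() {
    # Remove line breaks and carriage returns that may sneak in when copying.
    cp=$(printf '%s' "$1" | tr -d '\n\r')
    printf '<property>\n'
    printf '    <name>yarn.application.classpath</name>\n'
    printf '    <value>%s</value>\n' "$cp"
    printf '</property>\n'
}

# Usage (requires hadoop on the PATH):
#   make_classpath_property "$(hadoop classpath)"
```

The printed block can then be pasted inside the <configuration> element of yarn-site.xml.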


But I thought about it again later. I had already configured the environment variables in yarn-site.xml, so why would I need to configure the classpath here as well? Most likely something had gone wrong when I copied the configuration. Looking at yarn-site.xml carefully, I found two line breaks in the middle of the environment variable value. They prevented the value from being read correctly, which caused the earlier error that the class could not be found.


After removing these two line breaks, I saved the file, distributed yarn-site.xml to all machines, and restarted the cluster. wordcount then ran successfully.
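
The distribution step might look like the sketch below. The function name sync_conf and the host names in the usage comment are hypothetical. It also assumes passwordless SSH and that Hadoop is installed under the same path on every machine.

```shell
# Hypothetical helper: copy one config file to the same path on each host.
# Assumes passwordless SSH and identical install paths on all nodes.
sync_conf() {
    conf="$1"
    shift
    for host in "$@"; do
        # Stop at the first host that cannot be reached.
        scp "$conf" "$host:$conf" || return 1
    done
}

# Usage (host names are placeholders):
#   sync_conf "$HADOOP_HOME/etc/hadoop/yarn-site.xml" slave1 slave2
# Afterwards restart YARN, e.g. with stop-yarn.sh and start-yarn.sh.
```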


The lesson: when you need to paste long content into a configuration file, first maximize the terminal window so that hidden line breaks or spaces in the pasted content are easy to spot, especially in content that is longer than one line.
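
A quick automated check can complement the visual inspection. The sketch below uses a hypothetical function name, find_broken_values, and assumes each <value> is meant to open and close on a single line. It reports every line where a <value> tag is opened but not closed, which is what a stray line break inside a pasted value looks like.

```shell
# Hypothetical helper: list lines where <value> is opened but not closed.
# Assumes every value is supposed to fit on one line.
find_broken_values() {
    awk '/<value>/ && !/<\/value>/ { printf "%d: %s\n", NR, $0 }' "$1"
}

# Usage:
#   find_broken_values "$HADOOP_HOME/etc/hadoop/yarn-site.xml"
# No output means no value was split across lines.
```

With GNU coreutils, `cat -A yarn-site.xml` also makes line ends visible by printing a `$` at the end of each line.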

