Java & mapreduce: org.apache.hadoop.io.nativeio.NativeIO$Windows.createFileWithMode0

created at 03-23-2022

problem

In the project, local files need to be copied to HDFS. Since I am lazy, I used Java, which I am good at, and called the org.apache.hadoop.fs.FileSystem.copyFromLocalFile method, roughly as in the sketch below. Running the program locally (Windows 7) in local mode, I encountered the following exception:
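
The copy code was essentially the following (a minimal sketch; the class name and the paths are hypothetical, not from the original project):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CopyToHdfs {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // In local mode fs.defaultFS defaults to file:///, so the copy runs
    // through LocalFileSystem/RawLocalFileSystem -- the classes that
    // appear in the stack trace below.
    try (FileSystem fs = FileSystem.get(conf)) {
      fs.copyFromLocalFile(
          new Path("C:/data/input.txt"),   // hypothetical local source
          new Path("/tmp/input.txt"));     // hypothetical destination
    }
  }
}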

error message

An exception or error caused a run to abort: org.apache.hadoop.io.nativeio.NativeIO$Windows.createFileWithMode0(Ljava/lang/String;JJJI)Ljava/io/FileDescriptor; 
java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.createFileWithMode0(Ljava/lang/String;JJJI)Ljava/io/FileDescriptor;
    at org.apache.hadoop.io.nativeio.NativeIO$Windows.createFileWithMode0(Native Method)
    at org.apache.hadoop.io.nativeio.NativeIO$Windows.createFileOutputStreamWithMode(NativeIO.java:559)
    at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:219)
    at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:209)
    at org.apache.hadoop.fs.RawLocalFileSystem.createOutputStreamWithMode(RawLocalFileSystem.java:307)
    at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:295)
    at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:328)
    at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.<init>(ChecksumFileSystem.java:388)
    at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:451)
    at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:430)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:920)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:901)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:798)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:368)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:341)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:292)
    at org.apache.hadoop.fs.LocalFileSystem.copyFromLocalFile(LocalFileSystem.java:82)
    at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1882)

Analyzing the exception stack, we can see that the failure occurs in

org.apache.hadoop.io.nativeio.NativeIO$Windows.createFileWithMode0

The declaration of the createFileWithMode0 method is as follows:

/** Wrapper around CreateFile() with security descriptor on Windows */
private static native FileDescriptor createFileWithMode0(String path,
    long desiredAccess, long shareMode, long creationDisposition, int mode)
    throws NativeIOException;

The method is declared native, so its implementation must be supplied by the native Hadoop library (hadoop.dll on Windows); a java.lang.UnsatisfiedLinkError means the JVM could not find a matching implementation in any loaded native library, as the minimal illustration below shows.
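
As an illustration (this demo class is made up, not part of Hadoop): a native declaration compiles fine, but calling it fails at run time unless a loaded native library actually provides the implementation.

public class NativeDemo {
  // The declaration alone is legal Java; the implementation has to be
  // supplied by a native library loaded with System.loadLibrary.
  private static native void hello();

  public static void main(String[] args) {
    hello(); // throws java.lang.UnsatisfiedLinkError: NativeDemo.hello()V
  }
}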

So why is this method being called at all? Tracing further up the exception stack, NativeIO$Windows is invoked from the constructor org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>; the corresponding method is as follows:

private LocalFSFileOutputStream(Path f, boolean append,
        FsPermission permission) throws IOException {
      File file = pathToFile(f);
      if (permission == null) {
        this.fos = new FileOutputStream(file, append);
      } else {
        if (Shell.WINDOWS && NativeIO.isAvailable()) {
          this.fos = NativeIO.Windows.createFileOutputStreamWithMode(file,
              append, permission.toShort());
        } else {
          this.fos = new FileOutputStream(file, append);
          boolean success = false;
          try {
            setPermission(f, permission);
            success = true;
          } finally {
            if (!success) {
              IOUtils.cleanup(LOG, this.fos);
            }
          }
        }
      }
    }

The call stack shows that NativeIO.Windows.createFileOutputStreamWithMode is called on line 8 of the code above, which means the condition Shell.WINDOWS && NativeIO.isAvailable() must have evaluated to true. The NativeIO.isAvailable method is as follows:

/**
   * Return true if the JNI-based native IO extensions are available.
   */
  public static boolean isAvailable() {
    return NativeCodeLoader.isNativeCodeLoaded() && nativeLoaded;
  }

The isAvailable method delegates to NativeCodeLoader.isNativeCodeLoaded. That method, together with the static initializer that sets the nativeCodeLoaded flag, looks like this:

static {
    // Try to load native hadoop library and set fallback flag appropriately
    if(LOG.isDebugEnabled()) {
      LOG.debug("Trying to load the custom-built native-hadoop library...");
    }
    try {
      System.loadLibrary("hadoop");
      LOG.debug("Loaded the native-hadoop library");
      nativeCodeLoaded = true;
    } catch (Throwable t) {
      // Ignore failure to load
      if(LOG.isDebugEnabled()) {
        LOG.debug("Failed to load native-hadoop with error: " + t);
        LOG.debug("java.library.path=" +
            System.getProperty("java.library.path"));
      }
    }

    if (!nativeCodeLoaded) {
      LOG.warn("Unable to load native-hadoop library for your platform... " +
               "using builtin-java classes where applicable");
    }
  }

  /**
   * Check if native-hadoop code is loaded for this platform.
   *
   * @return <code>true</code> if native-hadoop is loaded,
   *         else <code>false</code>
   */
  public static boolean isNativeCodeLoaded() {
    return nativeCodeLoaded;
  }

As you can see, isNativeCodeLoaded simply returns a flag that the static initializer sets when System.loadLibrary("hadoop") succeeds. So where is the problem?
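
The flag is easy to inspect from your own code before doing any file I/O (a quick diagnostic sketch; the class name is made up, and only the Hadoop client is assumed to be on the classpath):

import org.apache.hadoop.io.nativeio.NativeIO;
import org.apache.hadoop.util.NativeCodeLoader;

public class NativeCheck {
  public static void main(String[] args) {
    // On a Windows dev box without hadoop.dll, the first two lines should
    // print false, and Hadoop falls back to the pure-Java code path.
    System.out.println("isNativeCodeLoaded = " + NativeCodeLoader.isNativeCodeLoaded());
    System.out.println("NativeIO.isAvailable = " + NativeIO.isAvailable());
    System.out.println("java.library.path = " + System.getProperty("java.library.path"));
  }
}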

reason and solution

The static initializer of the NativeCodeLoader class calls System.loadLibrary("hadoop"). Is this call the culprit? Debugging on colleagues' machines, System.loadLibrary("hadoop") throws, the catch block runs, and nativeCodeLoaded stays false; on my machine it succeeds and execution continues down the native-code path. So what does System.loadLibrary actually do? It searches the directories on java.library.path, which on Windows is built from the system and user Path environment variables, for the named library. Further analysis showed the cause: either hadoop.dll sits in the C:\Windows\System32 directory, or the %HADOOP_HOME%/bin directory (which contains hadoop.dll) is on the Path environment variable.

In short, because hadoop.dll exists in one of the directories on the system Path, System.loadLibrary("hadoop") succeeds and Hadoop takes the Windows native-code path; but the DLL that gets loaded does not provide a createFileWithMode0 implementation matching this Hadoop version, so the UnsatisfiedLinkError is thrown. The fix is just as simple: check every directory on the system Path and make sure none of them contains a hadoop.dll file.
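
To find the offending copy, you can scan exactly the directories that System.loadLibrary searches (a small helper sketch; the class name is made up):

import java.io.File;

public class FindHadoopDll {
  public static void main(String[] args) {
    // java.library.path is derived from the Windows Path variable, and it
    // is exactly where System.loadLibrary("hadoop") looks for hadoop.dll.
    for (String dir : System.getProperty("java.library.path")
                            .split(File.pathSeparator)) {
      File dll = new File(dir, "hadoop.dll");
      if (dll.isFile()) {
        System.out.println("found: " + dll.getAbsolutePath());
      }
    }
  }
}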

Note that the JVM reads java.library.path once at startup and caches it in the usr_paths and sys_paths fields of the ClassLoader class, so after removing a directory from the system Path you must restart IntelliJ IDEA for the change to take effect.
