HDFS File Interface

Basic command format:

hadoop fs -cmd < args >
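
For example, listing usage information for all available subcommands:

hadoop fs -help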

  1. ls

hadoop fs -ls /

Lists the directories and files in the root directory of the HDFS file system

hadoop fs -ls -R /

Recursively lists all directories and files in the HDFS file system

  2. put

hadoop fs -put < local file > < hdfs file >

Copies a single local file to the given HDFS path. The parent directory of the HDFS file must already exist, otherwise the command fails

hadoop fs -put < local file or dir > ... < hdfs dir >

Copies one or more local files or directories into an HDFS directory. The HDFS directory must already exist, otherwise the command fails

hadoop fs -put - < hdfs file >

Reads from the keyboard (standard input) into the HDFS file; press Ctrl+D to end the input. The HDFS file must not already exist, otherwise the command fails
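
For example, this reads typed text into a new HDFS file (the target path is a placeholder):

hadoop fs -put - /Test/notes.txt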

  3. moveFromLocal

hadoop fs -moveFromLocal < local src > ... < hdfs dst >

Like put, except that the local source is deleted after the command completes; it can also read input from the keyboard into the HDFS file

  4. copyFromLocal

hadoop fs -copyFromLocal < local src > ... < hdfs dst >

Like put; it can also read input from the keyboard into the HDFS file

  5. get

hadoop fs -get < hdfs file > < local file or dir >

Copies an HDFS file to the local file system. If a local file with the same name already exists, the command reports that the file exists; files whose names do not clash are copied to the local destination.

hadoop fs -get < hdfs file or dir > ... < local dir >

When copying multiple files or directories to the local file system, the local destination must be a directory path.

Note: if you are not running as root, the local path should be inside your home directory, otherwise you may run into permission problems.
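
For example, fetching two HDFS files into a local directory (all paths are placeholders):

hadoop fs -get /Test/a.txt /Test/b.txt /home/hadoop/downloads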

  6. copyToLocal

hadoop fs -copyToLocal < hdfs src > ... < local dst >

Similar to get

  7. rm

hadoop fs -rm < hdfs file > ...

hadoop fs -rm -r < hdfs dir>...

Multiple files or directories can be deleted at a time
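
For example, removing two directories and their contents in one command (the paths are placeholders):

hadoop fs -rm -r /Test/old1 /Test/old2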

  8. mkdir

hadoop fs -mkdir < hdfs path>

Creates one directory level at a time; if the parent directory does not exist, the command fails.

hadoop fs -mkdir -p < hdfs path>

Creates the directory together with any missing parent directories (like mkdir -p on Linux)
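
For example, this creates the whole path /Test/a/b in a single command even when /Test/a does not yet exist (the path is a placeholder):

hadoop fs -mkdir -p /Test/a/b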

  9. getmerge

hadoop fs -getmerge < hdfs dir > < local file >

Sorts all files in the specified HDFS directory and merges them into the specified local file. The local file is created automatically if it does not exist; if it exists, its contents are overwritten.

hadoop fs -getmerge -nl < hdfs dir > < local file >

With -nl, a newline is appended after each HDFS file merged into the local file
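
For example, merging every file under an HDFS directory into a single local file, with a newline appended after each source file (the paths are placeholders):

hadoop fs -getmerge -nl /Test/logs merged.txt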

  10. cp

hadoop fs -cp < hdfs file > < hdfs file >

The target file must not already exist, otherwise the command fails. This is effectively a copy under a new name: the source file still exists afterwards.

hadoop fs -cp < hdfs file or dir >... < hdfs dir >

The target directory must already exist, otherwise the command fails

  11. mv

hadoop fs -mv < hdfs file > < hdfs file >

The target file must not already exist, otherwise the command fails. This is effectively a rename: the source file no longer exists afterwards.

hadoop fs -mv < hdfs file or dir >... < hdfs dir >

When there are multiple source paths, the target path must be a directory and must exist.

Note: moving across file systems (local to HDFS or vice versa) is not allowed

  12. count

hadoop fs -count < hdfs path >

Reports statistics for the given HDFS path: the number of directories, the number of files, and the total file size.

The output columns are: number of directories, number of files, total size of files, and the input path.

HDFS Java API

package Hdfs;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.*;
import org.apache.hadoop.io.IOUtils;

import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.text.SimpleDateFormat;
import java.util.Date;

public class HdfsTest {

    private static Configuration conf = new Configuration();

    static {
        //Cluster deployed on a remote server; fs.defaultFS needs the full URI including the hdfs:// scheme
        conf.set("fs.defaultFS", "hdfs://172.18.74.236:9000");
    }


    //create a new file
    public static void createFile(String dst, byte[] contents) throws IOException
    {
        FileSystem fs = FileSystem.get(conf);
        Path dstPath = new Path(dst);//Target path
        //Open an output stream
        FSDataOutputStream outputStream = fs.create(dstPath);
        outputStream.write(contents);
        outputStream.close();
        fs.close();
        System.out.println("File Creation Successful");
    }

    //Import local files into Hdfs
    public static void uploadFile(String src,String dst) throws IOException{
        //Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path srcPath = new Path(src); //Local Upload File Path
        Path dstPath = new Path(dst); //hdfs target path
        //Copy the local file to HDFS; the first parameter controls whether the local source is deleted (true = delete, false = keep)
        fs.copyFromLocalFile(false, srcPath, dstPath);

        //Print file path
        System.out.println("Upload to " + conf.get("fs.defaultFS"));
        System.out.println("------------list files------------"+"\n");
        FileStatus [] fileStatus = fs.listStatus(dstPath);
        for (FileStatus file : fileStatus)
        {
            System.out.println(file.getPath());
        }
        fs.close();
    }

    //Upload by streaming a local file into an HDFS output stream
    public static void upload(String src, String dst) throws IOException{
        FileSystem fs = FileSystem.get(conf);

        Path dstPath = new Path(dst); //hdfs target path
        //try-with-resources closes both streams even if the copy fails
        try (FSDataOutputStream os = fs.create(dstPath);
             FileInputStream is = new FileInputStream(src)) {
            IOUtils.copyBytes(is, os, 4096, false);
        }
        fs.close();
    }
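
    //The reverse of uploadFile: a minimal sketch of downloading an HDFS file to the
    //local file system (the method name downloadFile is my own; copyToLocalFile is
    //the standard FileSystem counterpart of copyFromLocalFile).
    public static void downloadFile(String src, String dst) throws IOException{
        FileSystem fs = FileSystem.get(conf);
        Path srcPath = new Path(src); //hdfs source path
        Path dstPath = new Path(dst); //local target path
        //false = keep the HDFS source after copying
        fs.copyToLocalFile(false, srcPath, dstPath);
        fs.close();
        System.out.println("Downloaded " + src + " to " + dst);
    }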

    //File renaming
    public static void rename(String oldName, String newName) throws IOException{
        FileSystem fs = FileSystem.get(conf);

        Path oldPath = new Path(oldName);
        Path newPath = new Path(newName);
        boolean isok = fs.rename(oldPath, newPath);
        if(isok){
            System.out.println("rename ok!");

        }
        else {
            System.out.println("rename failure");
        }

        fs.close();
    }

    //Delete files or directories
    public static void delete(String filePath) throws IOException{
        FileSystem fs = FileSystem.get(conf);

        Path path = new Path(filePath);
        //delete(path, true) removes the path recursively and immediately
        //(deleteOnExit would only defer deletion until the FileSystem is closed)
        boolean isok = fs.delete(path, true);
        if(isok){
            System.out.println("delete ok!");
        }
        else {
            System.out.println("delete failure");
        }

        fs.close();
    }

    //Create directories
    public static void mkdir(String path) throws IOException{

        FileSystem fs = FileSystem.get(conf);

        Path srcPath = new Path(path);
        boolean isok = fs.mkdirs(srcPath);
        if (isok){
            System.out.println("create " + path + " dir ok !");

        }
        else{
            System.out.println("create "+ path +" dir failure!");
        }
        fs.close();
    }

    //Read the contents of a file and print them to standard output
    public static void readFile(String filePath) throws IOException{
        FileSystem fs = FileSystem.get(conf);

        Path path = new Path(filePath);
        InputStream in = null;
        try{
            in = fs.open(path);
            IOUtils.copyBytes(in, System.out, 4096, false);
        }
        finally {
            IOUtils.closeStream(in);
            fs.close();
        }
    }

    /**
     * List all files in the given directory
     */
    public static void getDirectoryFromHdfs(String direPath){

        try {
            FileSystem fs = FileSystem.get(conf);
            FileStatus[] filelist = fs.listStatus(new Path(direPath));
            System.out.println("______ All files in directory " + direPath + " _________");
            for (FileStatus fileStatus : filelist){
                System.out.println("Name: " + fileStatus.getPath().getName());
                System.out.println("Size: " + fileStatus.getLen());
                System.out.println("Path: " + fileStatus.getPath());
            }
            fs.close();

        } catch (IOException e) {
            e.printStackTrace();
        }
    }
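
    //A recursive variant, the Java counterpart of "hadoop fs -ls -R". This is a
    //sketch built on the standard FileSystem.listFiles API; the method name is my own.
    public static void listFilesRecursively(String direPath) throws IOException{
        FileSystem fs = FileSystem.get(conf);
        //listFiles(path, true) walks the whole directory tree and returns files only
        RemoteIterator<LocatedFileStatus> it = fs.listFiles(new Path(direPath), true);
        while (it.hasNext()) {
            LocatedFileStatus status = it.next();
            System.out.println(status.getPath() + "  (" + status.getLen() + " bytes)");
        }
        fs.close();
    }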
  
    public static void main(String[] args) throws IOException{

        String today = new SimpleDateFormat("yyyy-MM-dd").format(new Date());

        String localFilePath = "C:\\Users\\Charon\\Desktop\\To do.txt";
        String hdfsFilePath = "/Test" +today.substring(0,7) + "/upload_date=" + today + "/";


        //1. Traverse all files in the root directory
        getDirectoryFromHdfs("/");

        //2. New Directory
//        mkdir(hdfsFilePath);

        //3. Upload files
//        uploadFile(localFilePath,hdfsFilePath);
//        getDirectoryFromHdfs(hdfsFilePath);

        //4. Read files
//        readFile("hdfs://172.18.74.236:9000/Test2019-05/upload_date=2019-05-26/To do.txt");

        //5. Rename
//        rename("hdfs://172.18.74.236:9000/Test2019-05/upload_date=2019-05-26/To do.txt","hdfs://172.18.74.236:9000/Test2019-05/upload_date=2019-05-26/Test.txt");

        //6. Create a file and write to it
//        byte[] contents = "Added at 2019-05-26 20:22:37\n".getBytes();
//        createFile("hdfs://172.18.74.236:9000/Test2019-05/upload_date=2019-05-26/Test1.txt",contents);
//        readFile("hdfs://172.18.74.236:9000/Test2019-05/upload_date=2019-05-26/Test1.txt");

        //7. Delete files
        delete(hdfsFilePath);
    }
}
