Previously, I used streams to download HTTP and FTP files to the local disk, and also to upload local files to HDFS. Now I can transfer FTP and HTTP files straight to HDFS, with no need to copy them to the local machine first. The principle is actually very simple: read the FTP or HTTP file into a stream, then write the contents of that stream to HDFS. The data never needs to touch the local hard disk; the whole transfer happens in memory. I hope this tool helps anyone with the same need~
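To make the idea concrete, here is a minimal sketch of the stream-to-stream transfer. The class and method names are illustrative only, not the actual tool classes: in the real tool the input stream would come from FtpClient or HttpUtil and the output stream from the HDFS client, while byte arrays stand in for them here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class StreamPipeDemo {
    // Copy everything from in to out through a small fixed-size buffer,
    // so no data is ever written to the local disk.
    static void pipe(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical stand-ins: a remote file's stream and an HDFS sink.
        InputStream in = new ByteArrayInputStream("hello hdfs".getBytes("UTF-8"));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        pipe(in, out);
        System.out.println(out.toString("UTF-8")); // prints: hello hdfs
    }
}
```

Because only one buffer's worth of data is held at a time, the transfer uses constant memory regardless of file size.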
Here are the links to the previous tools:
http tool
ftp tool
The code is as follows:
import java.io.InputStream;
import java.io.IOException;

public class FileTrans {
    private String head = "";
    private String hostname = "";
    private String FilePath = "";
    private String hdfsFilePath = "";
    private HDFSUtil hdfsutil = null;
    private FtpClient ftp;
    private HttpUtil http;

    public void setFilePath(String FilePath){
        this.FilePath = FilePath;
    }

    public String getFilePath(){
        return this.FilePath;
    }

    public void setHdfsFilePath(String hdfsFilePath){
        this.hdfsFilePath = hdfsFilePath;
    }

    public String getHdfsFilePath(){
        return this.hdfsFilePath;
    }

    public void setHostName(String hostname){
        this.hostname = hostname;
    }

    public String getHostName(){
        return this.hostname;
    }

    public void setHead(String head){
        this.head = head;
    }

    public String getHead(){
        return this.head;
    }

    public FileTrans(String head, String hostname, String filepath, String hdfsnode, String hdfsFilepath){
        this.head = head;
        this.hostname = hostname;
        this.FilePath = filepath;
        this.hdfsFilePath = hdfsFilepath;
        if (head.equals("ftp") && !hostname.isEmpty()){
            this.ftp = new FtpClient(this.hostname);
        }
        if ((head.equals("http") || head.equals("https")) && !hostname.isEmpty()){
            String httpurl = head + "://" + hostname + "/" + filepath;
            this.http = new HttpUtil(httpurl);
        }
        if (!hdfsnode.isEmpty()){
            this.hdfsutil = new HDFSUtil(hdfsnode);
        }
        this.hdfsutil.setHdfsPath(this.hdfsFilePath);
        this.hdfsutil.setFilePath(hdfsutil.getHdfsNode() + hdfsutil.getHdfsPath());
        this.hdfsutil.setHadoopSite("./hadoop-site.xml");
        this.hdfsutil.setHadoopDefault("./hadoop-default.xml");
        this.hdfsutil.setConfigure(false);
    }

    public static void main(String[] args) throws IOException {
        String head = "";
        String hostname = "";
        String filepath = "";
        String hdfsfilepath = "";
        String hdfsnode = "";
        String localpath = "";
        InputStream inStream = null;
        int samplelines = 0;
        try {
            head = args[0];         // remote server type: http or ftp
            hostname = args[1];     // remote server hostname
            filepath = args[2];     // remote file path
            hdfsnode = args[3];     // HDFS namenode, without the hdfs:// prefix
            hdfsfilepath = args[4]; // HDFS file path
            localpath = args[5];    // local path for a sample copy; pass a blank value to skip
            samplelines = Integer.parseInt(args[6]); // save the first N lines locally; 0 to skip
        } catch (Exception e){
            System.out.println("[FileTrans]: input args error!");
            e.printStackTrace();
            return;
        }
        FileTrans filetrans = new FileTrans(head, hostname, filepath, hdfsnode, hdfsfilepath);
        if (filetrans.ftp == null && head.equals("ftp")){
            System.out.println("filetrans ftp null");
            return;
        }
        if (filetrans.http == null && (head.equals("http") || head.equals("https"))){
            System.out.println("filetrans http null");
            return;
        }
        try {
            if (head.equals("ftp")){
                inStream = filetrans.ftp.getStream(filepath);
                if (samplelines > 0){
                    filetrans.ftp.writeStream(inStream, localpath, samplelines);
                }
            } else {
                String url = head + "://" + hostname + "/" + filepath;
                inStream = filetrans.http.getStream(url);
                if (samplelines > 0){
                    filetrans.http.downLoad(url, localpath, samplelines);
                }
            }
            filetrans.hdfsutil.upLoad(inStream, filetrans.hdfsutil.getFilePath());
            if (head.equals("ftp")){
                filetrans.ftp.disconnect();
            }
        } catch (IOException e){
            System.out.println("[FileTrans]: file trans failed!");
            e.printStackTrace();
            return;
        }
        System.out.println("[FileTrans]: file trans success!");
    }
}

If compilation fails, refer to the notes in the earlier Hadoop tool article. Note: it is best to place the files of the other three tool classes in the same directory; if they are not placed together, adjust the references yourself.
This tool transfers FTP or HTTP files to HDFS, and can also save the first N lines locally for analysis.
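The local sampling step can be sketched as follows. This is a minimal, hypothetical helper analogous to the sampling options in the tool's writeStream and downLoad methods (the names here are illustrative, not the tool's actual API): it reads at most N lines from the stream and writes them to a local sink.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;

public class SampleLinesDemo {
    // Write at most n lines from the reader to out; returns how many were written.
    static int saveFirstLines(BufferedReader reader, Writer out, int n) throws IOException {
        int written = 0;
        String line;
        while (written < n && (line = reader.readLine()) != null) {
            out.write(line);
            out.write('\n');
            written++;
        }
        out.flush();
        return written;
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for the remote file's stream;
        // in practice out would be a FileWriter on the local sample path.
        BufferedReader in = new BufferedReader(new StringReader("a\nb\nc\nd\n"));
        StringWriter sample = new StringWriter();
        int n = saveFirstLines(in, sample, 2);
        System.out.println(n);               // prints: 2
        System.out.print(sample.toString()); // prints lines a and b
    }
}
```

Note that the sampled lines are consumed from the stream, which is why the tool handles sampling inside its own stream-writing methods rather than reading the stream twice.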
That is all the content covered in this article. I hope it is helpful to everyone learning Java.