
Installing Hadoop on Ubuntu 16 in a Virtual Machine

I have installed this before but never wrote it up; this time I am reinstalling in a virtual machine and keeping a record.

Installing the JDK

  1. Get the JDK (a copy found online, since the official site is slow and recently requires a login): Baidu Netdisk. Download it and extract it into the directory where you want to install it.

    tar -zxvf jdk-8u161-linux-x64.tar.gz


  2. Configure the environment variables (append at the bottom of ~/.bashrc)

    # set java environment
    export JAVA_HOME=/home/parallels/app/jdk1.8.0_161
    export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
    export PATH=$JAVA_HOME/bin:$PATH

    Note: change JAVA_HOME to the directory where you extracted the JDK.


    Note: if you cannot edit the file, prefix the command with sudo, and remember to save with :wq.

  3. Apply the configuration

    source ~/.bashrc
  4. Verify

    java -version


    If this prints the JDK version information, the JDK installation is complete. An optional extra check is sketched below.
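    As an optional extra check, you can confirm the shell really picked up the new variables; the path in the comment assumes the JAVA_HOME location set above:

    echo $JAVA_HOME    # should print /home/parallels/app/jdk1.8.0_161
    which java         # should resolve to a path under $JAVA_HOME/bin
    javac -version     # the compiler version should match java -version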

Installing Hadoop

  1. Get the installation package

    Download the corresponding package from the official site, then extract it into a local directory.

  2. Set the environment variables (~/.bashrc)

    export HADOOP_HOME=/home/parallels/app/hadoop-3.1.2
    export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH


    source ~/.bashrc  # source the file so the changes take effect
  3. Configuration (the files under etc/hadoop/ in the Hadoop directory)

    1. core-site.xml

      <configuration>
        <property>
          <name>fs.defaultFS</name>
          <value>hdfs://localhost:9000</value>
        </property>
      </configuration>
    2. hdfs-site.xml

      <configuration>
        <property>
          <name>dfs.replication</name>
          <value>1</value>
        </property>
      </configuration>
    3. hadoop-env.sh

      Set JAVA_HOME at the end of the file

      export JAVA_HOME=/home/parallels/app/jdk1.8.0_161
    4. Initial NameNode format (only needs to be run once)

      hadoop namenode -format
  4. Start HDFS

    ./sbin/start-dfs.sh
    1. Possible error (skip if it does not appear)


      Fix: first check whether the ssh service is running:

      ps -e|grep ssh

      If nothing is returned, ssh is probably not installed, so install it:

      sudo apt-get install openssh-server
    2. Possible error (skip if it does not appear)

      Starting namenodes on [localhost]
      ERROR: Attempting to operate on hdfs namenode as root
      ERROR: but there is no HDFS_NAMENODE_USER defined. Aborting operation.
      Starting datanodes
      ERROR: Attempting to operate on hdfs datanode as root
      ERROR: but there is no HDFS_DATANODE_USER defined. Aborting operation.
      Starting secondary namenodes [account.jetbrains.com]
      ERROR: Attempting to operate on hdfs secondarynamenode as root
      ERROR: but there is no HDFS_SECONDARYNAMENODE_USER defined. Aborting operation.

      Fix: add the following lines.

      In start-dfs.sh:

      HDFS_DATANODE_USER=root
      HADOOP_SECURE_DN_USER=hdfs
      HDFS_NAMENODE_USER=root
      HDFS_SECONDARYNAMENODE_USER=root

      In start-yarn.sh:

      YARN_RESOURCEMANAGER_USER=root
      HADOOP_SECURE_DN_USER=yarn
      YARN_NODEMANAGER_USER=root
    3. Possible error (skip if it does not appear)

      WARNING: HADOOP_SECURE_DN_USER has been replaced by HDFS_DATANODE_SECURE_USER. Using value of HADOOP_SECURE_DN_USER.
      Starting namenodes on [localhost]
      /etc/profile: line 30: Shared: command not found
      localhost: Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
      localhost: Permission denied (publickey,password).
      Starting datanodes
      /etc/profile: line 30: Shared: command not found
      localhost: Permission denied (publickey,password).
      Starting secondary namenodes [account.jetbrains.com]
      /etc/profile: line 30: Shared: command not found
      account.jetbrains.com: Warning: Permanently added 'account.jetbrains.com,0.0.0.0' (ECDSA) to the list of known hosts.
      account.jetbrains.com: Permission denied (publickey,password).

      Fix:

      At this point ssh localhost also fails, because the machine's own public key has not been authorized for itself.

      Generate a key pair under ~/.ssh and copy it to this machine:

      ssh-keygen -t rsa -P ""
      ssh-copy-id -i /home/hadoop/.ssh/id_rsa.pub username@localhost   # replace username with your own login user

    After a successful start the daemons come up without errors; a quick check is sketched below.

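    To confirm the daemons are actually running, jps (from the JDK installed earlier) lists the Java processes; the PIDs below are only illustrative:

    jps
    # expected to list something like (PIDs will differ):
    #   12345 NameNode
    #   12456 DataNode
    #   12678 SecondaryNameNode
    #   13001 Jps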

HDFS shell operations

  • Create test.txt on the local filesystem

    touch test.txt
  • Add some text to test.txt locally

    vi test.txt
  • Create the hadoop001/test directory in HDFS

    hadoop fs -mkdir -p /hadoop001/test
  • Upload test.txt to HDFS (a listing command to verify the upload is sketched after this list)

    hadoop fs -put test.txt /hadoop001/test/
  • View the contents of hadoop001/test/test.txt in HDFS

    hadoop fs -cat /hadoop001/test/test.txt
  • Download hadoop001/test/test.txt from HDFS to the local filesystem

    hadoop fs -get /hadoop001/test/test.txt test.txt
  • Delete hadoop001/test/ from HDFS (the command below removes the whole /hadoop001 tree)

    hadoop fs -rm -r /hadoop001
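A quick listing is a handy sanity check between the put and the delete; this sketch assumes the same /hadoop001/test path used above:

    hadoop fs -ls -R /hadoop001          # recursively list everything under /hadoop001
    hadoop fs -du -h /hadoop001/test     # human-readable size of the uploaded file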

Web UI access

Since Hadoop 3.1 is used here, the corresponding web UI links are:

  • http://localhost:8088 (YARN ResourceManager UI)


  • http://localhost:9870 (HDFS NameNode UI)


Note that the HDFS web UI port is 9870 here, not 50070 as in Hadoop 2.x.
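You can also check from the terminal; this sketch assumes the default ports, and 8088 only answers if YARN was started with start-yarn.sh:

    curl -s -o /dev/null -w "%{http_code}\n" http://localhost:9870   # NameNode UI, expect 200
    curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8088   # ResourceManager UI, expect 200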

Pitfall

After I configured MySQL and Hive and rebooted the machine once, the NameNode did not start and http://localhost:9870 could not be reached. I later fixed it by adding the following inside the configuration node of hdfs-site.xml:

<property>
  <name>dfs.namenode.name.dir</name>
  <value>/home/hadoop/app/tmp/dfs/name</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/home/hadoop/app/tmp/dfs/data</value>
</property>

Then I ran the format command once more:

hadoop namenode -format

Only after restarting did it work again, and nothing abnormal showed up in the startup logs, which is what makes this a trap. A likely cause is that by default the NameNode keeps its metadata under hadoop.tmp.dir, which points into /tmp and does not survive a reboot, so pinning dfs.namenode.name.dir and dfs.datanode.data.dir to a persistent path avoids the problem.
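For reference, the sequence to re-run after changing these directories looks roughly like this (it assumes the same layout as above; note that formatting erases any existing HDFS data):

    ./sbin/stop-dfs.sh        # stop any half-started daemons first
    hadoop namenode -format   # same format command as above; this wipes existing HDFS metadata
    ./sbin/start-dfs.sh       # start HDFS again
    jps                       # NameNode, DataNode and SecondaryNameNode should all be listed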

References & Thanks

Pitfalls when installing Hadoop 3.0

localhost: Permission denied

Permission denied when setting up Hadoop on Ubuntu

Installing and configuring Hadoop 3.1 on Ubuntu 16.04 (https://blog.csdn.net/qq_35614920/article/details/80526376)
