Package Details: hadoop 3.1.0-1

Git Clone URL: (read-only)
Package Base: hadoop
Description: Hadoop - MapReduce implementation and distributed filesystem
Upstream URL:
Licenses: Apache
Submitter: sjakub
Maintainer: severach (12eason)
Last Packager: severach
Votes: 72
Popularity: 0.674369
First Submitted: 2009-04-07 16:39
Last Updated: 2018-04-08 06:47

Latest Comments

dxxvi commented on 2017-06-07 05:08

How do I start this hadoop? I try:
sudo systemctl start hadoop-datanode hadoop-jobtracker hadoop-namenode hadoop-secondarynamenode hadoop-tasktracker
then check their status:
systemctl status hadoop-datanode hadoop-jobtracker hadoop-namenode hadoop-secondarynamenode hadoop-tasktracker
All of them failed. The jobtracker has this line:
Error: JAVA_HOME is not set and could not be found.
JAVA_HOME error:
Unable to start namenode and datanode: see the Hadoop ArchWiki on formatting a new distributed filesystem and on editing core-site.xml and hdfs-site.xml.
jobtracker and tasktracker cannot start: running the commands from hadoop-jobtracker.service and hadoop-tasktracker.service under the hadoop account shows the reasons (12eason also mentioned that).
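
A common fix for the JAVA_HOME failure under systemd (which does not read /etc/profile.d) is to set it in hadoop-env.sh. A minimal sketch, assuming the Arch OpenJDK symlink path; adjust to your installed JDK:

```shell
# /etc/hadoop/hadoop/hadoop-env.sh (path per this package's layout)
# Hypothetical fix: give the systemd units an explicit JDK path, since
# services do not source /etc/profile.d.
export JAVA_HOME=/usr/lib/jvm/default
```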

12eason commented on 2017-03-14 22:45

First thing, hdfs, mapred, container-executor, rcc and yarn all need to be linked to /usr/bin along with hadoop. Hdfs especially has a lot of the functions previously done by hadoop.

Secondly, the hadoop package provides shell scripts under sbin/ to start and stop instances, and these would be less prone to breakage if used in the systemd scripts. As it is, many of the commands systemd uses are deprecated.
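
A sketch of the symlinking suggested above, written the way a PKGBUILD package() would stage it; the staging path here is a stand-in so the snippet can run outside makepkg:

```shell
# Stage /usr/bin symlinks for the remaining Hadoop entry points, as
# package() would (pkgdir is makepkg's staging root; a demo path here).
pkgdir=/tmp/hadoop-pkgdir
mkdir -p "$pkgdir/usr/bin"
for tool in hdfs mapred container-executor rcc yarn; do
  ln -sf "/usr/lib/hadoop/bin/$tool" "$pkgdir/usr/bin/$tool"
done
```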

nmiculinic commented on 2017-03-11 17:47

There are mirror problems for hadoop:

==> Making package: hadoop 2.7.3-1 (Sat Mar 11 18:48:07 CET 2017)
==> Retrieving sources...
-> Downloading hadoop-2.7.3.tar.gz...
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:--  0:00:01 --:--:--     0
Warning: Transient problem: HTTP error. Will retry in 3 seconds. 3 retries left.

flipflop97 commented on 2016-11-21 13:36

Can you symlink /usr/lib/hadoop/bin/mapred to /usr/bin/mapred

severach commented on 2016-09-13 19:07

I'm looking to save time for others, not myself. The problem is that xz is very useful in the repos, where traffic reduction is worth any cost. xz is counterproductive on the AUR.

petronny commented on 2016-09-13 03:40

Hi, I found your discussion about the PKGEXT.
But have you ever tried compressing the package in parallel? (by setting 'xz -T0' in /etc/makepkg.conf)

I got
.pkg.tar.xz: 530% cpu 26.731s
with an E5-2660 0 @ 2.20GHz

And I think it may take much less time on an i3/i5/i7 CPU.
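
For reference, the parallel-compression setting mentioned above is a one-line change; a sketch, assuming xz >= 5.2 (the first release with threaded compression):

```shell
# /etc/makepkg.conf
# -T0 lets xz spawn one worker per core when compressing the package.
COMPRESSXZ=(xz -c -z -T0 -)
```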

ael commented on 2016-07-18 08:47

`hadoop-jobtracker.service` makes use of the command `/usr/bin/hadoop jobtracker`, which is deprecated. The output of that command suggests using the new yarn command.

Spyhawk commented on 2016-02-19 19:36

@severach> I see, this makes sense. Guess I'll have to find a workaround here. This issue is directly related to the removal of the --pkg option of makepkg in pacman 5.0.0.

severach commented on 2016-02-19 18:22

time makepkg -scCf # E3-1245v1
.pkg.tar: 5 seconds 326MB
.pkg.tar.gz: 13 seconds 207MB
.pkg.tar.xz: 88 seconds 188MB

Saving 120MB is worth 8 seconds. Saving 20MB is not worth 75 seconds. I didn't measure decompression time. gz is so fast that it is often faster than not compressing.

Spyhawk commented on 2016-02-19 15:44

@jsivak> Thanks for the report.
@severach> Actually, pacaur handles it correctly if the extension is manually changed to ".pkg.tar" inside the PKGBUILD. This only tars but does not compress the package. This is often used with very big packages to save installation time, such as wps-office for example.

I do not understand the rationale behind using "pkg.tar.gz" instead of "pkg.tar" if the objective is to save time. The gain would be minimal compared to the default compression settings, and still much slower than no compression at all.
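
The uncompressed-package setup described above is just an override of PKGEXT; a sketch of both variants discussed in this thread:

```shell
# At the end of the PKGBUILD (or in makepkg.conf to apply globally):
PKGEXT='.pkg.tar'      # tar only, no compression at all
#PKGEXT='.pkg.tar.gz'  # cheap gzip, the compromise this package uses
```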

jsivak commented on 2016-02-19 13:17

The build worked; I'll submit the issue to the pacaur dev.


severach commented on 2016-02-19 10:49

If that works, file a bug with pacaur. pacaur can't handle packages made with a custom PKGEXT.

jsivak commented on 2016-02-19 02:51

Edit: Just re-read the changelog... I think pacaur 4.5.3 fixes/addresses this issue.


I'm using pacaur and am getting these messages when trying to install the package:
==> Checking for packaging issue...
==> WARNING: backup entry file not in package : etc/hadoop/fair-scheduler.xml
==> WARNING: backup entry file not in package : etc/hadoop/mapred-queue-acls.xml
==> WARNING: backup entry file not in package : etc/hadoop/mapred-site.xml
==> WARNING: backup entry file not in package : etc/hadoop/masters
==> WARNING: backup entry file not in package : etc/hadoop/taskcontroller.cfg
==> WARNING: backup entry file not in package : etc/hadoop/
==> Creating package "hadoop"...
-> Generating .PKGINFO file...
-> Generating .BUILDINFO file...
-> Adding install file...
-> Generating .MTREE file...
-> Compressing package...
==> Leaving fakeroot environment.
==> Finished making: hadoop 2.7.2-1 (Thu Feb 18 21:45:22 EST 2016)
==> Cleaning up...
:: Installing hadoop package(s)...
:: hadoop package(s) failed to install. Check .SRCINFO for mismatching data with PKGBUILD.

Any suggestions?


monksy commented on 2015-08-13 05:30

When starting the jobtracker and tasktracker services I'm getting the error below. What has happened to those services?

sudo -u hadoop hadoop jobtracker
DEPRECATED: Use of this script to execute mapred command is deprecated.
Instead use the mapred command for it.

Sorry, the jobtracker command is no longer supported.
You may find similar functionality with the "yarn" shell command.
Usage: mapred [--config confdir] [--loglevel loglevel] COMMAND
where COMMAND is one of:
  pipes                      run a Pipes job
  job                        manipulate MapReduce jobs
  queue                      get information regarding JobQueues
  classpath                  prints the class path needed for running
                             mapreduce subcommands
  historyserver              run job history servers as a standalone daemon
  distcp <srcurl> <desturl>  copy file or directories recursively
  archive -archiveName NAME -p <parent path> <src>* <dest>  create a hadoop archive
  hsadmin                    job history server admin interface

Most commands print help when invoked w/o parameters.
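
As the notice says, the MRv1 daemons are gone in Hadoop 2.x; their rough YARN replacements (command names only; systemd units for them were not part of this package at the time) are:

```shell
# MRv1 daemon            YARN replacement (run as the hadoop user)
# hadoop jobtracker  ->  yarn resourcemanager
# hadoop tasktracker ->  yarn nodemanager
sudo -u hadoop yarn resourcemanager
```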

Jesse2004 commented on 2014-11-15 01:40

@roheim Hi, thanks for fixing the /etc issue! Now there's another problem: I installed the package but got an "Error: JAVA_HOME is not set and could not be found" when the hadoop command is run. It seems that JAVA_HOME is no longer set by /etc/profile.d/ now.
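
Until the package sets it again, a profile.d drop-in restores JAVA_HOME for login shells; a hypothetical sketch (the file name and JVM path are assumptions):

```shell
# /etc/profile.d/hadoop.sh (hypothetical file; adjust the path to your JDK)
export JAVA_HOME=/usr/lib/jvm/default
```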

skywalker1993 commented on 2014-11-02 13:52

The source URL is not found, and the same goes for the others in the PKGBUILD file.

roheim commented on 2014-10-29 02:25

@confusedfla: done

confusedfla commented on 2014-10-29 02:10

you could add polkit as dependency - it is needed for the systemd services :)

roheim commented on 2014-09-29 10:49

I will do the change ASAP.

contradictioned commented on 2014-09-24 13:08

@Jesse2004: In my work with hadoop I formed the habit of having folders like


etc., so that you can easily mount different alternative configurations. This carried over to this package.

jakebailey commented on 2014-09-24 12:54

@Jesse2004: There isn't a reason for it. It shouldn't be there, but the package maintainer hasn't fixed it (read their comment following mine, where I mentioned this 8 months ago).

Jesse2004 commented on 2014-09-24 12:51

Why are the configuration files located at /etc/hadoop/hadoop/* and not /etc/hadoop/*? What is the extra level for?

roheim commented on 2014-01-19 09:52

@zikaeroh: I have a src file ready for upload that fixes your problem, but I am getting an error when uploading.

jakebailey commented on 2014-01-13 06:52

Ignore the comment about the /lib/hadoop directory, it's just a symlink. Should have checked that beforehand.

However, moving everything from /etc/hadoop/hadoop to /etc/hadoop fixes all the issues I have with running pig with hadoop (as opposed to the local mode).
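
The move described above can be rehearsed on a throwaway copy of the tree first; a sketch (the real fix would operate on /etc/hadoop as root):

```shell
# Build a throwaway copy of the doubled layout, then flatten it.
demo=/tmp/hadoop-etc-demo
rm -rf "$demo"
mkdir -p "$demo/etc/hadoop/hadoop"
touch "$demo/etc/hadoop/hadoop/core-site.xml"

# The actual flattening step:
mv "$demo/etc/hadoop/hadoop/"* "$demo/etc/hadoop/"
rmdir "$demo/etc/hadoop/hadoop"
```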

jakebailey commented on 2014-01-13 06:28

According to the wiki (and the environment variables the package sets), all of the configs and other directories should be at /etc/hadoop/, but it's actually at /etc/hadoop/hadoop/. Should this be different?

Also, there are two hadoop folders in /lib/, one as "hadoop" and the other as "hadoop-2.20". Is that intentional?

contradictioned commented on 2013-05-24 11:00

Indeed the problem was an old pacman. Now installation works again.

Regarding the second hint: serious cleanup would be nice; I think a major rewrite would be even better. But I'm not quite sure how much this would break old installations on updates.

xgdgsc commented on 2013-05-24 03:32

Please follow the suggestions here:
I think you may not be using the latest pacman.

contradictioned commented on 2013-05-22 19:05

Hi, sorry, but I cannot confirm the "permission denied" problem.
The directory /usr/lib/hadoop-1.1.2 is owned and only writable by 'root', but the installation works (tested on a snapshot of a fresh Arch installation in a VM).

Anonymous comment on 2013-05-22 13:17

==> Starting package()...
cp: cannot create directory ‘/usr/lib/hadoop-1.1.2’: Permission denied

xgdgsc commented on 2013-05-21 01:05

Sorry, clicked at the wrong place.

xgdgsc commented on 2013-05-21 01:03

cp: cannot create directory ‘/usr/lib/hadoop-1.1.2’: Permission denied

Anonymous comment on 2013-05-08 19:03

The current PKGBUILD fails with:

==> Starting package()...
cp: cannot create directory ‘/usr/lib/hadoop-1.1.2’: Permission denied

Anonymous comment on 2013-04-05 08:04

Please update this package to 1.1.2

contradictioned commented on 2013-02-12 16:25

So far... I updated the package to 1.1.1, added systemd support (many thanks to MarkusH!) and added apache-ant as a dependency.

If you have more input for me, please ask :)

contradictioned commented on 2013-02-11 21:39

MarkusH: Looks nice, but you seem to have forgotten to update the MD5 sum of your conf.diff

Others: Sorry for forgetting my duty here. I'm on it right now.

MarkusH commented on 2013-02-06 22:16

I added a bunch of improvements regarding the Systemd unit files. See

MarkusH commented on 2013-02-06 22:16

I added a bunch of improvements regarding the Systemd unit files. See the link in my previous comment.

MarkusH commented on 2013-02-03 20:26

Hi, I updated the package and added systemd support. Would be nice if you could check it out and report problems:

contradictioned commented on 2012-12-05 11:02

@EiyuuZack: I'm gonna update that.
(And given current developments I will add systemd compatibility.)

Anonymous comment on 2012-12-04 09:17

New 1.1.1 release is out, any chance you can update it? Also the current source link is dead.

ytj commented on 2012-10-13 17:51

@alperkanat Could you explain why I should place the binaries into /usr/share/hadoop/bin rather than /usr/bin?

karabaja4 commented on 2012-04-01 02:30

dodolee, and others having similar problems: try adding options=(!strip) to the PKGBUILD. This seems to be an issue only with i686 builds.
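
For reference, the workaround goes in the PKGBUILD's options array; a minimal fragment:

```shell
# PKGBUILD fragment: keep makepkg from running strip over the bundled
# native binaries (strip chokes on task-controller on i686).
options=('!strip')
```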

Anonymous comment on 2012-01-27 14:06

The following error occurs while running makepkg:
==> Tidying install...
-> Purging unwanted files...
-> Compressing man and info pages...
-> Stripping unneeded symbols from binaries and libraries...
strip:./usr/share/hadoop/bin/task-controller: File format not recognized
==> ERROR: Makepkg was unable to build hadoop.

cgueret commented on 2012-01-18 14:28

Please replace the jre/jdk dependencies by "java-runtime"/"java-environment"
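
A sketch of the requested dependency change, using pacman's virtual Java provides (the apache-ant makedepends mirrors what the package used at the time; not the maintainer's actual change):

```shell
# PKGBUILD fragment (a sketch):
depends=('java-runtime')
makedepends=('java-environment' 'apache-ant')
```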

karabaja4 commented on 2012-01-09 16:04

Adopted and updated.

krevedko commented on 2011-08-16 13:50

--- PKGBUILD 2010-09-02 08:03:56.000000000 +0400
+++ PKGBUILD.fixed 2011-08-16 17:48:42.000000000 +0400
@@ -13,7 +13,7 @@ makedepends=('jdk' 'apache-ant')


krevedko commented on 2011-08-16 12:47
403: Forbidden.

alperkanat commented on 2011-04-18 11:50

In fact you're right. The hadoop project doesn't seem to provide a redirected (generic) URL that directs you to the closest mirror. So in this case, you may prefer a more central server instead of a US one.

alperkanat commented on 2011-04-18 11:49

how about this one:

sjakub commented on 2011-04-17 22:51

And what url would you prefer?

alperkanat commented on 2011-04-17 22:51

Why do you prefer to place the binaries into /usr/share/hadoop/bin rather than /usr/bin, or at least /usr/local/bin?

alperkanat commented on 2011-04-17 22:47

Can you please change the package URL to a more generic link? It's very slow for European and Asian users.

nitralime commented on 2010-12-19 13:53

Isn't it better to change the dependency on "lzo" to "lzo2"?