Package Details: hadoop 2.7.3-1

Git Clone URL: (read-only)
Package Base: hadoop
Description: Hadoop - MapReduce implementation and distributed filesystem
Upstream URL:
Licenses: Apache
Submitter: sjakub
Maintainer: severach
Last Packager: severach
Votes: 54
Popularity: 0.588319
First Submitted: 2009-04-07 16:39
Last Updated: 2016-08-28 15:56

Latest Comments

severach commented on 2016-09-13 19:07

I'm looking to save time for others, not myself. The problem is that xz is very useful in the repos, where traffic reduction is worth any cost, but it is counterproductive on the AUR.

Petron commented on 2016-09-13 03:40

Hi, I found your discussion about PKGEXT.
Have you tried compressing the package in parallel (by setting 'xz -T0' in /etc/makepkg.conf)?

I got
.pkg.tar.xz: 530% cpu 26.731s
with CPU E5-2660 0 @ 2.20GHz

And I think it may take much less time on an i3/i5/i7 CPU.
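
If anyone wants to try this, here is a minimal sketch of the change, assuming the stock `COMPRESSXZ` line in /etc/makepkg.conf. `-T0` is the short form of `--threads=0`, which lets xz use as many threads as there are cores:

```shell
# /etc/makepkg.conf (fragment)
# Assumed stock line: COMPRESSXZ=(xz -c -z -)
COMPRESSXZ=(xz -c -z - --threads=0)
```

This only affects packages built as .pkg.tar.xz; a PKGBUILD that sets its own PKGEXT to another format will not use it.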

ael commented on 2016-07-18 08:47

`hadoop-jobtracker.service` uses the command `/usr/bin/hadoop jobtracker`, which is deprecated. The output of that command suggests using the new yarn command instead.
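
For reference, a rough sketch of how the old MRv1 daemon names map onto YARN commands in Hadoop 2.x. The helper name `yarn_equivalent` is made up for illustration, and the mapping is approximate: the ResourceManager and NodeManager take over most, but not all, of what the JobTracker and TaskTracker did.

```shell
# Hypothetical helper: print the closest YARN command for an old MRv1 daemon.
# It only prints the command; it does not start anything.
yarn_equivalent() {
  case "$1" in
    jobtracker)  echo "yarn resourcemanager" ;;
    tasktracker) echo "yarn nodemanager" ;;
    *)           return 1 ;;  # no known equivalent
  esac
}

yarn_equivalent jobtracker
yarn_equivalent tasktracker
```

A replacement service unit would presumably call the printed command as the hadoop user, in the same way the current unit calls `/usr/bin/hadoop jobtracker`.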

Spyhawk commented on 2016-02-19 19:36

@severach> I see, this makes sense. Guess I'll have to find a workaround here. This issue is directly related to the removal of the --pkg option of makepkg in pacman 5.0.0.

severach commented on 2016-02-19 18:22

time makepkg -scCf # E3-1245v1
.pkg.tar: 5 seconds 326MB
.pkg.tar.gz: 13 seconds 207MB
.pkg.tar.xz: 88 seconds 188MB

Saving 120MB is worth 8 seconds. Saving 20MB is not worth 75 seconds. I didn't measure decompression time. gz is so fast that it is often faster than not compressing.
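
The trade-off can be worked out directly from the three measurements above. A small sketch (the `marginal_cost` helper is just an illustrative name) that prints the extra build time and the space saved at each step; the 120MB and 20MB figures are these results rounded:

```shell
# Measurements from the timing run: extension, seconds, size in MB.
results='.pkg.tar 5 326
.pkg.tar.gz 13 207
.pkg.tar.xz 88 188'

# For every row after the first, print the extra seconds and the MB saved
# relative to the previous row.
marginal_cost() {
  printf '%s\n' "$1" | awk '
    NR > 1 { printf "%s: +%d s, -%d MB\n", $1, $2 - t, s - $3 }
           { t = $2; s = $3 }'
}

marginal_cost "$results"
# .pkg.tar.gz: +8 s, -119 MB
# .pkg.tar.xz: +75 s, -19 MB
```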

Spyhawk commented on 2016-02-19 15:44

@jsivak> Thanks for the report.
@severach> Actually, pacaur handles it correctly if the extension is manually changed to ".pkg.tar" inside the PKGBUILD. This only tars the package without compressing it, which is often done with very big packages, such as wps-office, to save installation time.

I do not understand the rationale behind using "pkg.tar.gz" instead of "pkg.tar" if the objective is to save time. The gain would be minimal compared to the default compression settings, and it is still much slower than no compression at all.

jsivak commented on 2016-02-19 13:17

The build worked; I'll submit the issue to the pacaur dev.


severach commented on 2016-02-19 10:49

If that works, file a bug with pacaur. pacaur can't handle packages made with PKGEXT.

jsivak commented on 2016-02-19 02:51

Edit: Just re-read the Changelog.. I think pacaur 4.5.3 fixes/addresses this issue


I'm using pacaur and am getting these messages when trying to install the package:
==> Checking for packaging issue...
==> WARNING: backup entry file not in package : etc/hadoop/fair-scheduler.xml
==> WARNING: backup entry file not in package : etc/hadoop/mapred-queue-acls.xml
==> WARNING: backup entry file not in package : etc/hadoop/mapred-site.xml
==> WARNING: backup entry file not in package : etc/hadoop/masters
==> WARNING: backup entry file not in package : etc/hadoop/taskcontroller.cfg
==> WARNING: backup entry file not in package : etc/hadoop/
==> Creating package "hadoop"...
-> Generating .PKGINFO file...
-> Generating .BUILDINFO file...
-> Adding install file...
-> Generating .MTREE file...
-> Compressing package...
==> Leaving fakeroot environment.
==> Finished making: hadoop 2.7.2-1 (Thu Feb 18 21:45:22 EST 2016)
==> Cleaning up...
:: Installing hadoop package(s)...
:: hadoop package(s) failed to install. Check .SRCINFO for mismatching data with PKGBUILD.

Any suggestions?


monksy commented on 2015-08-13 05:30

When starting the jobtracker and tasktracker services I get the error shown below.

What has happened to those services?

sudo -u hadoop hadoop jobtracker
DEPRECATED: Use of this script to execute mapred command is deprecated.
Instead use the mapred command for it.

Sorry, the jobtracker command is no longer supported.
You may find similar functionality with the "yarn" shell command.
Usage: mapred [--config confdir] [--loglevel loglevel] COMMAND
       where COMMAND is one of:
  pipes                run a Pipes job
  job                  manipulate MapReduce jobs
  queue                get information regarding JobQueues
  classpath            prints the class path needed for running
                       mapreduce subcommands
  historyserver        run job history servers as a standalone daemon
  distcp <srcurl> <desturl> copy file or directories recursively
  archive -archiveName NAME -p <parent path> <src>* <dest> create a hadoop archive
  hsadmin              job history server admin interface

Most commands print help when invoked w/o parameters.
