author Lev Levitsky 2015-06-14 15:55:04 +0300
committer Lev Levitsky 2015-06-14 15:55:04 +0300
commit 72a36dec00eb5fc9065856f35905baa4b1027776 (patch)
tree 13bef12f918f40254e23f8567bd8da7f64f1e2b2
download aur-72a36dec00eb5fc9065856f35905baa4b1027776.tar.gz
Initial import
-rw-r--r-- .SRCINFO 20
-rw-r--r-- CHANGELOG 544
-rw-r--r-- PKGBUILD 22
3 files changed, 586 insertions, 0 deletions
diff --git a/.SRCINFO b/.SRCINFO
new file mode 100644
index 000000000000..7af7e9639d9a
--- /dev/null
+++ b/.SRCINFO
@@ -0,0 +1,20 @@
+# Generated by makepkg 4.2.1
+# Wed Apr 15 22:17:40 UTC 2015
+pkgbase = python2-pyteomics
+ pkgdesc = A framework for proteomics data analysis.
+ pkgver = 3.0.1
+ pkgrel = 1
+ url = http://pythonhosted.org/pyteomics
+ changelog = CHANGELOG
+ arch = any
+ license = Apache
+ depends = python2
+ optdepends = python2-matplotlib: for pylab_aux module, optional
+ optdepends = python2-lxml: for XML parsing modules, recommended
+ optdepends = python2-numpy: for lots of features, highly recommended
+ options = !emptydirs
+ source = https://pypi.python.org/packages/source/p/pyteomics/pyteomics-3.0.1.tar.gz
+ md5sums = 2c838cc1c16dce69148662b883c755b9
+
+pkgname = python2-pyteomics
+
diff --git a/CHANGELOG b/CHANGELOG
new file mode 100644
index 000000000000..a9f2bb997e99
--- /dev/null
+++ b/CHANGELOG
@@ -0,0 +1,544 @@
+3.0.1
+-----
+
+ - Added `legend_kwargs` as a keyword argument to
+ :py:func:`pyteomics.pylab_aux.scatter_trend`.
+
+ - Minor fixes.
+
+3.0.0
+-----
+ - XML parsers are now implemented as objects, each format has its own class.
+ Those classes can be instantiated using the same arguments as :py:func:`read`
+ functions accepted, and support direct iteration and the ``with`` syntax.
+ The :py:func:`read` functions are now simple aliases to the corresponding
+ constructors.
+
+ - As a result, functions :py:func:`iterfind`, :py:func:`version_info` and
+ :py:func:`get_by_id` are now deprecated in favor of methods
+ :py:meth:`iterfind` and :py:meth:`get_by_id` and attribute
+ :py:attr:`version_info` of corresponding instances.
+
+ - In :py:func:`pyteomics.mgf.write`, the order of keys and the format of values
+ are now controlled via module-level variables.
+
+ - In :py:mod:`pyteomics.electrochem`, correction for pK of terminal groups
+ depending on the terminal residue is implemented; example set of pK and
+ corrected pK added.
+
+ - Imports of external dependencies are delayed where possible, so that
+ unnecessary :py:exc:`ImportErrors` do not occur.
+
+ - :py:func:`local_fdr` renamed to :py:func:`qvalues` in :py:mod:`pepxml`,
+ :py:mod:`mzid`, :py:mod:`tandem` and :py:mod:`auxiliary`.
+ :py:func:`local_fdr` did not reflect the semantics of the function.
+ The algorithm has also been corrected so that the array of q-values
+ is always sorted (as it should be by definition).
+
+ - :py:func:`qvalues` now also accepts a parameter `full_output` which keeps the
+ PSMs alongside their scores and associated q-values.
+
+ - All :py:func:`fdr`, :py:func:`qvalues`, and :py:func:`!filter` functions
+ now accept a new parameter `correction`. It is used for more accurate
+ estimation of the number of false positives using TDA (paper with explanation
+ submitted to JPR).
+
+ - :py:func:`!filter` functions now support both iterator protocol and context
+ manager protocol. They now also accept the `full_output` parameter, which has
+ the following meaning: if :py:const:`True` (default), then an array of PSMs
+ is directly returned by the function. Otherwise, an iterator is returned, as
+ before. The array takes some memory, but this way is usually around 2x faster.
+
+ - New function :py:func:`pyteomics.pylab_aux.plot_qvalue_curve`.
+
+ - :py:class:`pyteomics.mass.Composition` objects now have a :py:meth:`mass`
+ method (equivalent to :py:func:`pyteomics.mass.calculate_mass`).
+
+ - Also, :py:class:`Composition` and objects returned by
+ :py:func:`pyteomics.parser.amino_acid_composition` now inherit from
+ :py:class:`collections.defaultdict` **and** :py:class:`collections.Counter`.
+
+ - Decoy-related functions in :py:mod:`pyteomics.fasta` now accept a new parameter
+ `keep_nterm` that preserves the N-terminal residue in the generated decoy
+ sequences.
+
+ - Minor fixes.
+
+API changes
+...........
+
+ - In :py:func:`pyteomics.pylab_aux.scatter_trend`, keyword arguments for
+ :py:func:`pylab.scatter` and :py:func:`pylab.plot` are now accepted as dicts
+ `scatter_kwargs` and `plot_kwargs`. Keyword argument `alpha` is no longer
+ accepted and should be put in the appropriate dict.
+ - In :py:func:`pyteomics.pylab_aux.plot_function_3d` and
+ :py:func:`pyteomics.pylab_aux.plot_function_contour`, arbitrary kwargs can
+ now also be passed to the plotting function.
+ - :py:func:`!filter` functions do not support context manager protocol by
+ default. To keep using them as iterators / context managers, specify
+ ``full_output=False`` (see above for details).
+
+2.5.5
+-----
+
+Fix for a memory leak in :py:func:`pyteomics.mzid.get_by_id`, which affects
+:py:func:`pyteomics.mzid.read` with ``retrieve_refs=True``.
+
+2.5.4
+-----
+
+ - New functions :py:func:`local_fdr` in :py:mod:`pepxml`, :py:mod:`mzid`, and
+ :py:mod:`tandem`. The function returns a NumPy array with PSM scores and
+ corresponding values of local FDR.
+
+ - New parameter `iterative` in :py:func:`read` functions of XML parsing
+ modules. Parsing of mzIdentML files with ``retrieve_refs=True`` got
+ significantly faster.
+
+2.5.3
+-----
+
+ - Universally applicable modifications are now allowed in
+ :py:func:`pyteomics.parser.isoforms`.
+ - It is now also possible to specify non-terminal modifications which are
+ only applicable to terminal residues.
+ - Fix in :py:func:`pyteomics.parser.parse`: if the `labels` argument is
+ provided, it needs to contain standard terminal groups if they are present
+ in the sequence or if `show_unmodified_termini` is set to :py:const:`True`.
+ - :py:class:`pyteomics.mass.Composition` instances are now pickleable.
+ - Performance improvements.
+
+2.5.2
+-----
+
+ - New parameter `reverse` in all :py:func:`!filter` functions.
+ - New function :py:func:`pyteomics.mass.fast_mass2`, which is analogous to
+ :py:func:`pyteomics.mass.fast_mass`, but supports full *modX* notation and
+ is several times slower.
+ - Fix in :py:func:`pyteomics.pepxml.read` for compatibility with files
+ produced with Mascot2XML utility.
+ - Unknown labels now allowed in :py:mod:`pyteomics.electrochem` and
+ :py:mod:`pyteomics.achrom` functions in accordance with new general policy.
+
+2.5.1
+-----
+
+ - Bugfixes in :py:func:`pyteomics.parser.isoforms`:
+ - handling of the `labels` argument is now in accordance with new policy
+ - solved memory problems when using `max_mods`
+ - :py:func:`pyteomics.parser.cleave` does not require a valid *modX* sequence
+ by default.
+
+2.5.0
+-----
+
+ - :py:func:`pyteomics.parser.amino_acid_composition` now accepts "split"
+ parsed sequences.
+
+ - Cleavage rules in :py:data:`pyteomics.parser.expasy_rules` updated.
+
+ - Helper function :py:func:`pyteomics.parser.num_sites` counts the number
+ of cleavage sites in a sequence.
+
+ - Helper function :py:func:`pyteomics.parser.match_modX` does essentially
+ the same as :py:func:`pyteomics.parser.is_modX`, but returns a
+ :py:class:`re.match` object or :py:const:`None` instead of a :py:class:`bool`.
+
+ - Bugfix in :py:func:`pyteomics.auxiliary.filter`, which didn't work correctly
+ with iterators.
+
+ - Added a new parameter ``max_mods`` in :py:func:`pyteomics.parser.isoforms`.
+
+API changes
+...........
+
+ - The boolean ``overlap`` parameter in :py:func:`pyteomics.parser.cleave` is
+ replaced with an integer ``min_length``. Since ``min_length`` uses
+ :py:func:`pyteomics.parser.length`, the ``labels`` keyword argument is now
+ accepted by :py:func:`cleave` and :py:func:`num_sites`, if needed. With
+ carefully designed cleavage rules, all cleavage functions work
+ with *modX* sequences.
+
+ - The ``labels`` argument in :py:func:`pyteomics.parser.parse` and related
+ functions has changed its meaning. :py:func:`parse` won't raise an exception
+ for non-standard labels in sequences if the ``labels`` keyword argument is
+ not given.
+
+ - The *modX* notation specification is now more strict to avoid ambiguity:
+ only zero or two terminal groups can be present in a *modX* sequence.
+ Sequences with one terminal group specified will be supported where possible,
+ but be advised that sequences such as "H-OH" are intrinsically ambiguous.
+
+2.4.3
+-----
+
+ - Added the ``ratio`` keyword argument for FDR calculation.
+
+ - Minor changes in :py:func:`iterfind` functions of file parsers.
+
+ - Bugfix in :py:func:`pyteomics.mgf.write` (duplication of pepmass key).
+
+ - Removed non-functional parameter ``read_schema`` for
+ :py:func:`pyteomics.tandem.read`.
+
+2.4.2
+-----
+
+ - Bugfix in :py:func:`pyteomics.mass.most_probable_isotopic_composition`.
+ The bug manifested itself after version **2.4.0**, when
+ :py:data:`pyteomics.mass.nist_mass` was expanded. Also, the format of the
+ returned value is now in accordance with the documentation.
+
+2.4.1
+-----
+
+ - New function :py:func:`pyteomics.auxiliary.filter` for filtering lists
+ of PSMs not coming directly from files in supported formats.
+
+ - Also, a format-agnostic helper function :py:func:`pyteomics.auxiliary.fdr`.
+
+2.4.0
+-----
+
+ - New functions for filtering to a certain FDR level based on target-decoy
+ strategy, as well as for FDR estimation, in :py:mod:`pyteomics.tandem`,
+ :py:mod:`pyteomics.pepxml` and :py:mod:`pyteomics.mzid`. The functions are
+ called :py:func:`!filter` (beware of shadowing the built-in function) and
+ :py:func:`fdr` (in each of the modules). Chained versions
+ :py:func:`filter.chain` and :py:func:`filter.chain.from_iterable` are
+ also available. See `Data Access <data.html#general-notes>`_ for more info.
+
+ - New function :py:func:`pyteomics.parser.coverage` for sequence coverage
+ calculation.
+
+ - New function :py:func:`pyteomics.fasta.decoy_chain`, a chained version of
+ :py:func:`pyteomics.fasta.decoy_db`.
+
+ - New elements in :py:data:`pyteomics.mass.nist_mass`. Pretty much all elements
+ are there now.
+
+ - Fix in :py:func:`pyteomics.parser.parse` to cover some fancy corner cases.
+
+ - Bugfix in :py:mod:`pyteomics.tandem`: modification info is now fully extracted.
+
+ - :py:func:`pyteomics.mass.isotopic_composition_abundance` is now able to
+ calculate abundances for larger molecules.
+
+ .. note::
+ Rounding errors may be significant in this case.
+
+2.3.0
+-----
+
+ - New parameter "read_schema" in :py:func:`read` functions of XML parsing modules.
+ When set to :py:const:`False`, disables the attempts to fetch an auxiliary file
+ and obtain structure information about the file being parsed.
+
+ - New function :py:func:`chain` in all modules that have a :py:func:`read`
+ function, for convenient chaining of multiple files. :py:func:`chain` only
+ works as a context manager. Use :py:func:`itertools.chain` in other cases.
+ The ``chain.from_iterable`` form is also available as a context manager.
+
+ - New function :py:func:`pyteomics.auxiliary.print_tree` for exploration of
+ complex nested dicts produced by XML parsers.
+
+ - New sets of retention coefficients in :py:mod:`pyteomics.achrom`.
+
+ - Bugfix in :py:mod:`pyteomics.pepxml`. The bug caused an exception when parsing
+ some pepXML files.
+
+ - The output of :py:func:`pyteomics.mgf.read` now always contains a masked
+ array of charges.
+
+ - Other minor fixes.
+
+API change
+..........
+
+ - In :py:func:`pyteomics.mgf.read` the precursor charge is now always represented
+ by a list of ints (a :py:class:`ChargeList` object).
+
+2.2.2
+-----
+
+ - Bugfix in :py:mod:`pyteomics.tandem`. The info about all proteins is now
+ extracted.
+
+2.2.1
+-----
+
+ - Update parsers for FASTA headers.
+
+ - NamedTuple for FASTA entries is now defined globally, which should solve
+ pickling problems.
+
+2.2.0
+-----
+
+ - New module :py:mod:`pyteomics.tandem` for reading output files of X!Tandem
+ search engine.
+
+2.1.6
+-----
+
+ - Fix in :py:mod:`pyteomics.pepxml`. pepXML files generated by TPP are now
+ processed without errors.
+
+
+2.1.5
+-----
+
+ - Fix in :py:mod:`pyteomics.pepxml`. 'modified_peptide' is now always available.
+
+ - Fix in :py:mod:`pyteomics.mass` (issue #2 in the bug tracker).
+
+ - Improved arithmetic for :py:class:`Composition` objects.
+
+2.1.4
+-----
+
+ - In :py:mod:`fasta`, :py:func:`decoy_db` now doesn't write to file, but returns
+ an iterator over FASTA records. The old :py:func:`decoy_db` is now called
+ :py:func:`write_decoy_db`, which is equivalent to :py:func:`decoy_db` combined
+ with :py:func:`write`.
+
+Bugfixes:
+
+ - In :py:func:`pyteomics.mgf.read`, the charges, if present, are returned as a
+ masked array now. Previously, an exception occurred if charges were missing
+ for some of the fragments.
+
+ - Values in :py:data:`mass.nist_mass` corrected.
+
+ - Other minor corrections.
+
+2.1.3
+-----
+
+ - Adjust the behavior affected by the bug fixed in 2.1.2. `name` attributes
+ of `<cvParam>` elements in the absence of `value` attributes are now collected
+ in a list under the `'name'` key.
+
+ - Add support for overlapping matches in :py:func:`parser.cleave`.
+
+2.1.2
+-----
+
+ - Bugfix in XML parsers. The bug caused the mzML parser to break on some files.
+ The fix can slightly change the format of the output.
+
+2.1.1
+-----
+
+ - Rename keys in the dicts returned by :py:func:`mgf.read` to facilitate
+ writing code working with both MGF and mzML.
+
+ - The items yielded by :py:func:`fasta.read` now have attributes `description`
+ and `sequence`.
+
+2.1.0
+-----
+
+ - New sets of retention coefficients in :py:mod:`achrom`.
+
+ - :py:class:`mass.Composition` now only stores non-zero ints.
+
+ - :py:mod:`fasta` now has tools for parsing of FASTA headers.
+
+ - File parsers now implement the `context manager protocol
+ <http://docs.python.org/reference/datamodel.html#with-statement-context-managers>`_.
+ We recommend using `with` statements to avoid resource leaks.
+
+API changes
+...........
+
+ - 'pepmass' is now a tuple in the output of :py:func:`mgf.read` (to allow
+ reading precursor intensities).
+
+ - new function :py:func:`fasta.parse` for convenient parsing of FASTA headers.
+
+ - :py:data:`fasta.std_parsers` stores parsers for common UniProt header formats.
+
+ - new parameter *parser* in :py:func:`fasta.read` allows applying parsing while
+ reading a FASTA file.
+
+ - `close` parameter removed in all functions that do file I/O. The unified
+ behavior is: if the parameter is a file object, it won't be closed by the
+ function. If a file path is given, the file object will be created and closed
+ inside the corresponding function.
+
+2.0.3
+-----
+
+ - Added new class :py:class:`pyteomics.mass.Unimod`. The interface is
+ experimental and may change.
+
+ - Improved :py:func:`iterfind` function in XML-reading modules.
+
+ - :py:class:`pyteomics.mass.Composition` objects now support multiplication by
+ :py:class:`int`.
+
+ - Bugfix in :py:func:`auxiliary.linear_regression`.
+
+2.0.2
+-----
+
+ - Added new function :py:func:`iterfind` in :py:mod:`pyteomics.mzid`,
+ :py:mod:`pyteomics.pepxml` and :py:mod:`pyteomics.mzml`.
+
+2.0.1
+-----
+
+API changes
+...........
+
+ - :py:func:`pyteomics.parser.peptide_length` is renamed to
+ :py:func:`pyteomics.parser.length`.
+
+2.0.0
+-----
+
+ - Added :py:mod:`mzid` module for parsing of mzIdentML files.
+
+ - Fixed bugs, improved tests.
+
+API changes
+...........
+
+ - top-module functions in :py:mod:`fasta`, :py:mod:`mgf`, :py:mod:`mzml`,
+ :py:mod:`pepxml`, as well as :py:mod:`mzid`, are now called :py:func:`read`.
+
+ - in :py:mod:`parser`, :py:func:`parse_sequence` renamed to :py:func:`parse`.
+ It now accepts an optional parameter `allow_unknown_modifications`.
+
+ - :py:func:`mgf.write_mgf` and :py:func:`fasta.write_fasta` renamed to
+ :py:func:`write`.
+
+ - the output format of all :py:func:`read` functions has changed.
+
+1.2.5
+-----
+
+ - Include Apache license version 2.0:
+ http://www.opensource.org/licenses/Apache-2.0
+
+ - Minor bugfix in :py:mod:`pyteomics.fasta`.
+
+1.2.4
+-----
+
+ - Changes in :py:mod:`pyteomics.mass`.
+
+API changes
+...........
+
+ - :py:class:`Composition` objects can be created using positional first
+ argument, which will be treated as a sequence or (upon failure) as a formula.
+ This means that all functions relying on Composition
+ (:py:func:`calculate_mass`, :py:func:`most_probable_isotopic_composition`,
+ :py:func:`isotopic_composition_abundance`) allow that as well. However, it's
+ of no use for the latter.
+
+ - :py:class:`Composition` entries for modifications can be added to *aa_comp*
+ and used in composition and mass calculations. This way the specified group
+ will be added to any residue bearing this modification.
+
+ - That being said, the :py:func:`add_modifications` function is not needed
+ anymore and has been removed.
+
+ - Addition and subtraction of :py:class:`Composition` objects now produces a
+ :py:class:`Composition` object, allowing addition/subtraction of multiple
+ objects.
+
+ - :py:class:`Composition` is now a subclass of
+ :py:class:`collections.defaultdict` so one can safely retrieve values
+ without checking if a key exists.
+
+1.2.3
+-----
+
+ - :py:func:`pyteomics.parser.isoforms` now allows terminal modifications.
+
+ - Bugfixes in :py:func:`pyteomics.parser.parse_sequence`.
+
+ - New function :py:func:`pyteomics.parser.tostring` converts parsed sequences
+ to strings.
+
+ - Helper function :py:func:`pyteomics.parser.is_modX` added to check *modX* labels.
+
+API changes
+...........
+
+ - :py:func:`pyteomics.parser.isoforms` now returns a generator object
+
+1.2.2
+-----
+
+ - Bugfix in :py:mod:`pyteomics.pepxml`: modification info is now extracted.
+ - New optional bool argument 'split' in :py:func:`pyteomics.parser.parse_sequence()`
+ allows generating a list of tuples where modifications are separated from the
+ residues instead of a regular list of labels. In *labels* not only *modX* labels
+ are now allowed, but also separate *mod* prefixes. Such modifications are
+ assumed to be applicable to any residue.
+
+
+1.2.1
+-----
+
+ - Memory usage **significantly** decreased when parsing large mzML and pepXML
+ files.
+
+1.2.0
+-----
+
+ - Added support for Python 3. Python 2.7 is still supported, Python 2.6 is not.
+
+1.1.1
+-----
+
+ - New function called :py:func:`add_modifications()` added in
+ :py:mod:`pyteomics.mass`. It updates *aa_comp*.
+ - Also, :py:func:`pyteomics.parser.isoforms` is a new function to get
+ all possible modified sequences of a peptide.
+
+1.1.0
+-----
+
+ - New module added - :py:mod:`pyteomics.mgf`. It is intended for reading and
+ writing files in Mascot Generic Format.
+
+1.0.2
+-----
+
+ - In :py:mod:`pyteomics.pepxml` module, now all search hits are read from file
+ (not only the top hit).
+
+API changes:
+............
+
+ - :py:func:`pyteomics.pepxml.read`: information specific to search hits is now
+ stored in a list under the ``'search_hits'`` key. The list is sorted by hit
+ rank.
+
+
+1.0.1
+-----
+
+ - Fix compatibility issues in :py:mod:`pyteomics.pepxml` module.
+
+1.0.0
+-----
+
+ - The first public release of Pyteomics.
+
+API changes:
+............
+
+ - :py:mod:`pyteomics.achrom`: rename ``'length correction factor'`` to
+ ``'length correction parameter'``.
+
+ - :py:func:`pyteomics.achrom.get_RCs_vary_lcf` was renamed to
+ :py:func:`pyteomics.achrom.get_RCs_vary_lcp`.
+ - `length_correction_factor` keyword argument of
+ :py:func:`pyteomics.achrom.get_RCs` was renamed to `lcp`.
+
diff --git a/PKGBUILD b/PKGBUILD
new file mode 100644
index 000000000000..811249bb4941
--- /dev/null
+++ b/PKGBUILD
@@ -0,0 +1,22 @@
+# Maintainer: Lev Levitsky <levlev at mail dot ru>
+pkgname=python2-pyteomics
+pkgver=3.0.1
+pkgrel=1
+pkgdesc="A framework for proteomics data analysis."
+arch=('any')
+url="http://pythonhosted.org/pyteomics"
+license=('Apache')
+depends=('python2')
+optdepends=('python2-matplotlib: for pylab_aux module, optional' \
+ 'python2-lxml: for XML parsing modules, recommended' \
+ 'python2-numpy: for lots of features, highly recommended')
+options=(!emptydirs)
+source=("https://pypi.python.org/packages/source/p/pyteomics/pyteomics-${pkgver}.tar.gz")
+md5sums=('2c838cc1c16dce69148662b883c755b9')
+changelog="CHANGELOG"
+package() {
+ cd "${srcdir}/pyteomics-${pkgver}"
+ python2 setup.py install --root="$pkgdir/" --optimize=1
+}
+
+# vim:set ts=2 sw=2 et: