MemoryError during import #5207

Closed
jhuldtgren opened this issue Apr 25, 2024 · 2 comments · Fixed by #5564

Comments

@jhuldtgren

Trying to import an album results in a MemoryError. The reason seems to be that one of the candidates found on Discogs is an album with 6666 tracks. Disabling the discogs plugin, so that this release is no longer a candidate, "fixes" it and lets the import proceed. Albums with that many tracks are not the norm, but is there something that can be tweaked so beets can handle them?

Problem

$ beet -vv import -a SomeAlbum
...
Success. Distance: 0.06
Sending event: albuminfo_received
Candidate: Lord Poozix - Grim And Frostbitten Penguins (29948101)
Computing track assignment...
Traceback (most recent call last):
  File "/usr/local/bin/beet", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/site-packages/beets/ui/__init__.py", line 1285, in main
    _raw_main(args)
  File "/usr/local/lib/python3.10/site-packages/beets/ui/__init__.py", line 1272, in _raw_main
    subcommand.func(lib, suboptions, subargs)
  File "/usr/local/lib/python3.10/site-packages/beets/ui/commands.py", line 973, in import_func
    import_files(lib, paths, query)
  File "/usr/local/lib/python3.10/site-packages/beets/ui/commands.py", line 943, in import_files
    session.run()
  File "/usr/local/lib/python3.10/site-packages/beets/importer.py", line 340, in run
    pl.run_parallel(QUEUE_SIZE)
  File "/usr/local/lib/python3.10/site-packages/beets/util/pipeline.py", line 446, in run_parallel
    raise exc_info[1].with_traceback(exc_info[2])
  File "/usr/local/lib/python3.10/site-packages/beets/util/pipeline.py", line 311, in run
    out = self.coro.send(msg)
  File "/usr/local/lib/python3.10/site-packages/beets/util/pipeline.py", line 193, in coro
    func(*(args + (task,)))
  File "/usr/local/lib/python3.10/site-packages/beets/importer.py", line 1376, in lookup_candidates
    task.lookup_candidates()
  File "/usr/local/lib/python3.10/site-packages/beets/importer.py", line 660, in lookup_candidates
    autotag.tag_album(self.items, search_ids=self.search_ids)
  File "/usr/local/lib/python3.10/site-packages/beets/autotag/match.py", line 466, in tag_album
    _add_candidate(items, candidates, matched_candidate)
  File "/usr/local/lib/python3.10/site-packages/beets/autotag/match.py", line 372, in _add_candidate
    mapping, extra_items, extra_tracks = assign_items(items, info.tracks)
  File "/usr/local/lib/python3.10/site-packages/beets/autotag/match.py", line 105, in assign_items
    matching = Munkres().compute(costs)
  File "/usr/local/lib/python3.10/site-packages/munkres.py", line 144, in compute
    self.path = self.__make_matrix(self.n * 2, 0)
  File "/usr/local/lib/python3.10/site-packages/munkres.py", line 181, in __make_matrix
    matrix += [[val for j in range(n)]]
  File "/usr/local/lib/python3.10/site-packages/munkres.py", line 181, in <listcomp>
    matrix += [[val for j in range(n)]]
MemoryError
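A rough estimate of why this allocation fails, based only on the last frames of the traceback: `__make_matrix(self.n * 2, 0)` builds a square list-of-lists with side `2n`. The sketch below assumes that munkres pads the cost matrix to a square (so `n` is the larger of the item and track counts) and that each list slot costs 8 bytes on a 64-bit CPython build. The item count of 12 is a made-up placeholder, since the size of the imported album is not stated here.

```python
# Back-of-the-envelope estimate only, not a measurement.
# Assumptions: 64-bit CPython (8-byte list slots) and a cost matrix
# that is padded to a square before solving.

POINTER_BYTES = 8  # assumed size of one list slot


def path_matrix_slots(n_items: int, n_tracks: int) -> int:
    """Slots in the (2n x 2n) matrix built by __make_matrix(self.n * 2, 0)."""
    n = max(n_items, n_tracks)  # side length after padding to a square
    side = 2 * n
    return side * side


# 12 items is a hypothetical album size; 6666 tracks comes from the release.
slots = path_matrix_slots(n_items=12, n_tracks=6666)
gib = slots * POINTER_BYTES / 2**30
print(f"{slots:,} slots, about {gib:.2f} GiB of list storage")
```

Under these assumptions that single matrix needs roughly 1.3 GiB for the list slots alone, before counting the cost matrix itself, which would be consistent with the process running out of memory at this point.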

This is the Discogs release that triggers the bug:

https://www.discogs.com/release/29948101-Lord-Poozix-Grim-And-Frostbitten-Penguins

Setup

  • OS: OpenBSD 7.4
  • Python version: Python 3.10.13
  • beets version: 1.6.0
  • Turning off plugins made problem go away (yes/no): yes

My configuration (output of beet config) is:

$ beet config
lyrics:
    bing_lang_from: []
    auto: yes
    bing_client_secret: REDACTED
    bing_lang_to:
    google_API_key: REDACTED
    google_engine_ID: REDACTED
    genius_api_key: REDACTED
    fallback:
    force: no
    local: no
    sources:
    - google
    - musixmatch
    - genius
    - tekstowo
library: /home/johan/.config/beets/library.db
directory: /music

plugins:
- bucket
- chroma
- discogs
- duplicates
- edit
- embedart
- fetchart
- fromfilename
- info
- inline
- lastgenre
- lyrics
- mbsync
- missing
- permissions
- plexupdate
- replaygain
- rewrite
- web
pluginpath: /home/johan/.config/beets/plugins
ignore: .AppleDouble ._* *~ .DS_Store
ignore_hidden: yes
asciify_paths: yes
art_filename: '%asciify{%lower{$album}}'
original_date: yes
permissions:
    dir: 775
    file: 664

replace:
    -\.\.+: '-'
    \.\.+$: ''
    \.$: ''
    \.\.\.: _
    '[\\/]': ''
    ^\.: ''
    '[\x00-\x1f]': ''
    \._: '-'
    ':': ''
    '[<>"\?\*\|\!]': ''
    \s+$: ''
    ^\s+: ''
    \[\]: ''
    '&': _and_
    '#': _
    \,: ''
    \;: ''
    '''': ''
    \(: ''
    \): ''
    \?: ''
    \!: ''
    "\u2026": ''
    \s: _
    _-_: _
    _\._: _
    _+: _
    "\xBF": ''
    '[\xc2\xbf]': ''

import:
    copy: yes
    duplicate_action: keep
    write: yes
    languages: en sv no dk is
item_fields:
    path_format: "if format == 'FLAC':\n        return 'flac'\nreturn 'mp3'\n"
    path_genre: "for p in ['alternative', 'reggae', 'world music']:\n    if p in genre.lower():\n       return 'pop'\nfor l in ['arabic', 'latin', 'merengue', 'reggaeton', 'salsa', 'tango']:\n    if l in genre.lower():\n        return 'latin'\nfor s in ['musical', 'soundtrack', 'soundtracks']:\n    if s in genre.lower():\n        return 'soundtracks'\nfor r in ['hip-hop', 'hip hop', 'r&b', 'rap']:\n    if r in genre.lower():\n        return 'rnb'\nfor i in ['industrial rock', 'country', 'blues', 'jazz']:\n    if i in genre.lower():\n        return 'rock'\nfor m in ['ambient', 'death', 'doom', 'folk', 'grind', 'core', 'punk', 'industrial']:\n    if m in genre.lower():\n        return 'metal'\nfor g in ['classical', 'comedy', 'humour', 'latin', 'metal', 'pop', 'rock']:\n    if g in genre.lower():\n        return g\nif not genre:\n    return 'fixit'\n"
    path_composer: "if not composer_sort:\n        return composer\nreturn composer_sort\n"
album_fields:
    path_bucket: "for g in ['classical', 'metal', 'rock', 'country', 'blues', 'jazz', 'grindcore', 'funeral doom', 'industrial', 'humour']:\n    if g in genre.lower():\n        return g\n"
    path_year: "if original_year == 0000:\n        return year\nreturn original_year\n"
    path_various: "for v in ['musical', 'soundtrack', 'soundtracks']:\n    if v in genre.lower():\n        return 0\nreturn 1\n"
    path_artist: "if not albumartist_sort:\n        return albumartist\nreturn albumartist_sort\n"
bucket:
    extrapolate: yes
    bucket_alpha:
    - a
    - b
    - c
    - d
    - e
    - f
    - g
    - h
    - i
    - j
    - k
    - l
    - m
    - n
    - o
    - p
    - q
    - r
    - s
    - t
    - u
    - v
    - w
    - x
    - y
    - z
    - 0-9
    bucket_alpha_regex:
        a: "^[\xC6]"
        0-9: ^[0-9]
    bucket_year: []

paths:
    default: $path_format/%lower{$path_genre}/%if{$path_bucket,(%lower{%bucket{$path_artist,alpha}})}/%lower{$path_artist}/${year}-%lower{$album}/${track}-%lower{$title}
    comp: $path_format/%lower{$path_genre}/%if{$path_bucket,v}/%if{$path_various,various_artists}/%lower{$path_artist}/${year}-%lower{$album}/${track}-%lower{$title}
    singleton: $path_format/%lower{$path_genre}/%if{$path_bucket,(%lower{%bucket{$artist_sort,alpha}})}/%lower{$artist_sort}/non-album/${track}-%lower{$title}
    genre:Classical: $path_format/%lower{$path_genre}/%if{$path_bucket,(%lower{%bucket{$path_composer,alpha}})}/%lower{$path_composer}/${year}-%lower{$album}/${track}-%lower{$title}
    series::.: $path_format/%lower{$path_genre}/%if{$path_bucket,v}/various_artists/%lower{$series}/${year}-%lower{$album}/${track}-%lower{$title}
web:
    host: REDACTED
    port: REDACTED
    cors: ''
    cors_supports_credentials: no
    reverse_proxy: no
    include_paths: no
    readonly: yes
chroma:
    auto: yes
acoustid:
    apikey: REDACTED
replaygain:
    auto: no
    albumgain: yes
    backend: ffmpeg
    command: /usr/local/bin/ffmpeg
    overwrite: no
    threads: 12
    parallel_on_import: no
    per_disc: no
    peak: 'true'
    targetlevel: 89
    r128: [Opus]
    r128_targetlevel: 84
metalarchives:
    lyrics: yes
    lyrics_search: yes
edit:
    itemfields:
    - album
    - albumartist
    - artist
    - track
    - title
    - year
    albumfields:
    - albumartist
    - album
    - year
    - albumtype
    - genre
    - original_year
    - original_month
    - original_day
    ignore_fields: id path
duplicates:
    album: no
    checksum: ''
    copy: ''
    count: no
    delete: no
    format: ''
    full: no
    keys: []
    merge: no
    move: ''
    path: no
    tiebreak: {}
    strict: no
    tag: ''
discogs:
    apikey: REDACTED
    apisecret: REDACTED
    tokenfile: discogs_token.json
    source_weight: 0.5
    user_token: REDACTED
    separator: ', '
    index_tracks: no
rewrite: {}
embedart:
    maxwidth: 0
    auto: yes
    compare_threshold: 0
    ifempty: no
    remove_art_file: no
    quality: 0
fetchart:
    auto: yes
    minwidth: 0
    maxwidth: 0
    quality: 0
    max_filesize: 0
    enforce_ratio: no
    cautious: no
    cover_names:
    - cover
    - front
    - art
    - album
    - folder
    sources:
    - filesystem
    - coverart
    - itunes
    - amazon
    - albumart
    google_key: REDACTED
    google_engine: REDACTED
    fanarttv_key: REDACTED
    lastfm_key: REDACTED
    store_source: no
    high_resolution: no
    deinterlace: no
    cover_format:
pathfields: {}
lastgenre:
    whitelist: yes
    min_weight: 10
    count: 1
    fallback:
    canonical: no
    source: album
    force: yes
    auto: yes
    separator: ', '
    prefer_specific: no
    title_case: yes
missing:
    count: no
    total: no
    album: no
@theraspb3rry

theraspb3rry commented Dec 16, 2024

I just hit the same thing (I searched for this Discogs ID and found this issue, haha). I added a check like the one below so that I don't keep hitting it and can continue using Discogs during a large import.

Line 408 (in get_album_info) of beetsplug/discogs.py:

        if result.data["id"] == 29948101:
            self._log.debug("Skipping huge discogs release. https://www.discogs.com/release/29948101-Lord-Poozix-Grim-And-Frostbitten-Penguins")
            return None

It would make sense to add a more generic track limit (say, 500) to catch any other weird entries, though.
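A minimal sketch of what such a generic limit could look like, in place of the hard-coded release ID. The helper name `exceeds_track_limit` and the constant `MAX_TRACKS` are hypothetical and not part of beets. The sketch also assumes the release data is a dict with a `tracklist` list, as in Discogs API responses.

```python
MAX_TRACKS = 500  # hypothetical default, taken from the "say 500" suggestion


def exceeds_track_limit(release_data: dict, limit: int = MAX_TRACKS) -> bool:
    """Return True if the release lists more entries than `limit`.

    Assumes `release_data["tracklist"]` is a list; a missing or empty
    tracklist counts as zero tracks.
    """
    tracklist = release_data.get("tracklist") or []
    return len(tracklist) > limit


# Stand-in data shaped like the assumed release dict.
huge = {"id": 29948101, "tracklist": [{"title": f"Track {i}"} for i in range(6666)]}
normal = {"id": 1, "tracklist": [{"title": f"Track {i}"} for i in range(12)]}

print(exceeds_track_limit(huge))    # the 6666-track release is over the limit
print(exceeds_track_limit(normal))  # an ordinary album is not
```

In the plugin, the ID comparison above would then become something like `if exceeds_track_limit(result.data): return None`. Note that a tracklist may contain heading or index entries as well as tracks, so this check compares an upper bound on the track count.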

@snejus
Member

snejus commented Dec 27, 2024

Would you mind giving this update a try with your import by any chance?

I've got an album with 136 tracks and I tried matching it to this crazy release - it seems to work OK on my end.
