Changes for 0.7.1 'Frigatebird':
------------------------------
0.7.1 is a minor update on 0.7.0:
- ARM32 NEON optimizations for itxfm, which can give up to 28% speedup, and MSAC
- SSE2 optimizations for prep_bilin and prep_8tap
- AVX2 optimizations for MC scaled
- Fix a clamping issue in motion vector projection
- Fix an issue with the ipred_z AVX2 functions on some specific Haswell CPUs
- Improvements on the dav1dplay utility player to support resizing

Changes for 0.7.0 'Frigatebird':
------------------------------
0.7.0 is a major release for dav1d:
- Faster refmv implementation, gaining up to 12% speed while using 25% less RAM (single thread)
- 10b/12b ARM64 optimizations are mostly complete:
  - ipred (paeth, smooth, dc, pal, filter, cfl)
  - itxfm (only 10b)
- AVX2/SSSE3 for non-4:2:0 film grain and for mc.resize
- AVX2 for cfl 4:4:4
- AVX-512 CDEF filter
- ARM64 8b improvements for cfl_ac and itxfm
- ARM64 implementation for emu_edge in 8b/10b/12b
- ARM32 implementation for emu_edge in 8b
- Improvements on the dav1dplay utility player to support 10 bit,
  non-4:2:0 pixel formats and film grain on the GPU

Changes for 0.6.0 'Gyrfalcon':
------------------------------
0.6.0 is a major release for dav1d:
- New ARM64 optimizations for the 10/12bit depth:
  - mc_avg, mc_w_avg, mc_mask
  - mc_put/mc_prep 8tap/bilin
  - mc_warp_8x8
  - mc_w_mask
  - mc_blend
  - wiener
  - SGR
  - loopfilter
  - cdef
- New AVX-512 optimizations for prep_bilin, prep_8tap, cdef_filter, mc_avg/w_avg/mask
- New SSSE3 optimizations for film grain
- New AVX2 optimizations for msac_adapt16
- Fix rare mismatches against the reference decoder, notably because of clipping
- Improvements on ARM64 on msac, cdef and looprestoration optimizations
- Improvements on AVX2 optimizations for cdef_filter
- Improvements in the C version for itxfm, cdef_filter

Changes for 0.5.2 'Asiatic Cheetah':
------------------------------------
0.5.2 is a small release improving speed for ARM32 and adding minor features:
- ARM32 optimizations for loopfilter, ipred_dc|h|v
- Add section-5 raw OBU demuxer
- Improve the speed by reducing the L2 cache collisions
- Fix minor issues

Changes for 0.5.1 'Asiatic Cheetah':
------------------------------------
0.5.1 is a small release improving speeds and fixing minor issues
compared to 0.5.0:
- SSE2 optimizations for CDEF, wiener and warp_affine
- NEON optimizations for SGR on ARM32
- Fix mismatch issue in x86 asm in inverse identity transforms
- Fix build issue in ARM64 assembly if debug info was enabled
- Add a workaround for Xcode 11 -fstack-check bug

Changes for 0.5.0 'Asiatic Cheetah':
------------------------------------
0.5.0 is a medium release fixing regressions and minor issues,
and improving speed significantly:
- Export ITU T.35 metadata
- Speed improvements on blend_ on ARM
- Speed improvements on decode_coef and MSAC
- NEON optimizations for blend*, w_mask_, ipred functions for ARM64
- NEON optimizations for CDEF and warp on ARM32
- SSE2 optimizations for MSAC hi_tok decoding
- SSSE3 optimizations for deblocking loopfilters and warp_affine
- AVX2 optimizations for film grain and ipred_z2
- SSE4 optimizations for warp_affine
- VSX optimizations for wiener
- Fix inverse transform overflows in x86 and NEON asm
- Fix integer overflows with large frames
- Improve film grain generation to match reference code
- Improve compatibility with older binutils for ARM
- More advanced Player example in tools

Changes for 0.4.0 'Cheetah':
----------------------------
- Fix playback with unknown OBUs
- Add an option to limit the maximum frame size
- SSE2 and ARM64 optimizations for MSAC
- Improve speed on 32-bit systems
- Optimization in obmc blend
- Reduce RAM usage significantly
- The initial PPC SIMD code, cdef_filter
- NEON optimizations for blend functions on ARM
- NEON optimizations for w_mask functions on ARM
- NEON optimizations for inverse transforms on ARM64
- VSX optimizations for CDEF filter
- Improve handling of malloc failures
- Simple Player example in tools

Changes for 0.3.1 'Sailfish':
------------------------------
- Fix a buffer overflow in frame-threading mode on SSSE3 CPUs
- Reduce binary size, notably on Windows
- SSSE3 optimizations for ipred_filter
- ARM optimizations for MSAC

Changes for 0.3.0 'Sailfish':
------------------------------
This is the final release for the numerous speed improvements of 0.3.0-rc.
It mostly:
- Fixes an annoying crash on SSSE3 that happened in the itx functions

Changes for 0.2.2 (0.3.0-rc) 'Antelope':
-----------------------------
- Large improvement on MSAC decoding with SSE, bringing a 4-6% speed increase.
  The impact is important on SSSE3, SSE4 and AVX2 CPUs
- SSSE3 optimizations for all block sizes in itx
- SSSE3 optimizations for ipred_paeth and ipred_cfl (420, 422 and 444)
- Speed improvements on CDEF for SSE4 CPUs
- NEON optimizations for SGR and loop filter
- Fixes for minor crashes, plus improvements and build changes

Changes for 0.2.1 'Antelope':
----------------------------
- SSSE3 optimization for cdef_dir
- AVX2 improvements of the existing CDEF optimizations
- NEON improvements of the existing CDEF and wiener optimizations
- Clarification about the numbering/versioning scheme

Changes for 0.2.0 'Antelope':
----------------------------
- ARM64 and ARM optimizations using NEON instructions
- SSSE3 optimizations for both 32-bit and 64-bit
- More AVX2 assembly, reaching almost completion
- Fix installation of includes
- Rewrite inverse transforms to avoid overflows
- Snap packaging for Linux
- Updated API (ABI and API break)
- Fixes for un-decodable samples

Changes for 0.1.0 'Gazelle':
----------------------------
Initial release of dav1d, the fast and small AV1 decoder.
- Support for all features of the AV1 bitstream
- Support for all bit depths: 8, 10 and 12 bits
- Support for all chroma subsamplings 4:2:0, 4:2:2, 4:4:4 *and* grayscale
- Full acceleration for 64-bit AVX2 processors, making it the fastest decoder
- Partial acceleration for SSSE3 processors
- Partial acceleration for NEON processors