Default Branch

1d8fca72ae · metal : add GGML_OP_REPEAT kernels (#7557) · Updated 2024-05-27 11:10:19 +02:00

Branches

20054a38c1 · Fix directory name · Updated 2023-05-27 01:00:08 +02:00

2418
1

a1cdd29cd2 · ggml : rms_norm in chunks · Updated 2023-05-20 09:15:54 +02:00

2439
2

95dc4d7270 · Merge 'origin/master' into steering · Updated 2023-05-19 22:19:57 +02:00

2441
9

40ec4882c4 · ggml : use F16C conversion when available · Updated 2023-05-17 19:05:51 +02:00

2450
1

a3e6d62283 · cuda : alternative q4_q8 kernel · Updated 2023-05-12 16:02:39 +02:00

2484
8

e116eb638c · ggml : speed-up Q5_0 + Q5_1 at 4 threads · Updated 2023-05-11 17:51:56 +02:00

2486
20

4baa85633a · Fix build · Updated 2023-05-07 03:44:07 +02:00

2494
5

31ff9e2e83 · ci : add cublas to windows release · Updated 2023-05-03 23:21:20 +02:00

2509
1

102cd98074 · ggml : Q4_3c using 2x "Full range" approach · Updated 2023-04-23 13:56:44 +02:00

2590
8

71e6ae3779 · ggml : continue from #729 (wip) · Updated 2023-04-22 17:49:07 +02:00

2590
7

a0242a833c · Minor, plus rebase on master · Updated 2023-04-22 16:07:10 +02:00

2590
2

4b8d5e3890 · llama : quantize attention results · Updated 2023-04-22 10:35:13 +02:00

2595
1

1506737499 · Add mmap pages stats (disabled by default) · Updated 2023-04-16 18:22:30 +02:00

2645
1

36ddd12924 · llama : add flash attention (demo) · Updated 2023-04-05 21:12:04 +02:00

2711
1

c9c820ff36 · Added support for _POSIX_MAPPED_FILES if defined in source (#564) · Updated 2023-03-28 23:26:25 +02:00

2945
8

4aeee216fd · Regroup q4_1 dot addition for better numerics. · Updated 2023-03-24 21:20:57 +01:00

2826
2

66ea164e1d · Kahan summation on Q4_1 · Updated 2023-03-23 04:28:51 +01:00

2853
2

711224708d · Break up loop for numeric stability · Updated 2023-03-23 03:14:44 +01:00

2853
2

3a0dcb3920 · Implement server mode. · Updated 2023-03-22 18:34:19 +01:00

2854
5

dev

a169bb889c · Gate signal support on being on a unixoid system. (#74) · Updated 2023-03-13 04:08:01 +01:00

2960
0
Included