Tags: ggml-org/llama.cpp

b7046

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
ggml-cpu : use template for argsort (#17222)

b7045

CANN: Add cross_entropy_loss op support (#16886)

* update L2_NORM op support

* update L2_NORM op support

* remove extra whitespace

* cann: update cross_entropy_loss op support

* remove trailing whitespaces

* rebase the latest code in the main repository and remove the l2_norm operator that already exists in another pull request.

* undo the l2_norm operator deletion

b7044

CUDA: fuse rope + set_rows (#16884)

* CUDA: add fused rope

* move k forward_expand up

* create helper function instead of re-using params

* make assert statement more in line with comment

* rope_norm: coalesced writes to global mem

b7042

vocab : correct bounds check for UGM XCDA array access (#17215)

b7041

CUDA: static assert to prevent misuse of memcpy_1 (#17198)

b7039

ggml : use std::sort in ggml_argsort CPU implementation (#17211)

* ggml : use std::sort in ggml_argsort CPU implementation

* cont : add missing header
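For readers unfamiliar with the technique named in this commit, here is a minimal sketch of an argsort built on `std::sort`, with the sort direction chosen by a template parameter in the spirit of the later b7046 change. It is an illustration only, not the ggml code: the names `sort_order` and `argsort_row` are hypothetical, and the real implementation operates on ggml tensors rather than `std::vector`.

```cpp
#include <algorithm>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical names for illustration; not taken from ggml.
enum class sort_order { asc, desc };

// Returns the indices that would sort `row` in the requested order.
// Assumes the values contain no NaN, since NaN breaks the strict weak
// ordering that std::sort requires.
template <sort_order order>
std::vector<int32_t> argsort_row(const std::vector<float> & row) {
    std::vector<int32_t> idx(row.size());
    std::iota(idx.begin(), idx.end(), 0); // idx = 0, 1, 2, ...

    // Sort the indices, comparing the values they point to.
    std::sort(idx.begin(), idx.end(), [&row](int32_t a, int32_t b) {
        return order == sort_order::asc ? row[a] < row[b] : row[a] > row[b];
    });
    return idx;
}
```

Because `order` is a template parameter, the direction check is resolved at compile time for each instantiation. Note that `std::sort` is not stable, so equal values may come back in any relative order.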

b7037

server: (refactor) implement generator-based API for task results (#17174)

* server: (refactor) implement generator-based API for task results

* improve

* moving some code

* fix "Response ended prematurely"

* add sink.done before return false

* rm redundant check

* rm unused var

* rename generator --> reader

b7035

server: move res_error/res_ok to static function (#17167)

b7034

ggml-cpu: handle 3d tensors in repack mat_mul (#17030)

* ggml-cpu: handle 3d tensors in repack mul_mat

* Removed unnecessary branch, removed need for <algorithm>

* Fixed dst_ptr pointer in chunk + clang_format

* GGML_ASSERT to check wdata within bounds

* Accidental ggml.h inclusion

* Improved GGML_ASSERT on wdata boundaries

b7033

cmake : cleanup (#17199)