Add a lora dense layer #1263
Conversation
Note that this is Keras 3/Keras Core only because that library will allow you to set trainable on individual parameters (tf.keras does not). It doesn't seem worth the effort to build backwards compat here, this can be a strictly forward looking Keras 3 API.
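For readers following along, here is a minimal sketch of the per-variable `trainable` behavior this relies on, assuming a Keras 3 / Keras Core install (the layer below is illustrative, not part of the PR):

```python
import keras  # assumes Keras 3 / Keras Core

# Build a plain dense layer so its variables exist.
dense = keras.layers.Dense(4)
dense.build((None, 8))

# Keras 3 lets you toggle `trainable` on an individual variable, which is
# what allows a LoRA wrapper to freeze just the original kernel while its
# own low-rank update weights stay trainable. tf.keras has no equivalent.
dense.kernel.trainable = False
print([v.name for v in dense.trainable_variables])  # only the bias remains
```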
```python
    not backend_config.backend() == "tensorflow",
    reason="tests only run on tf backend",
)
multi_backend_only = pytest.mark.skipif(
```
Maybe `keras_3_only`?
Hmm, on second thought I'll leave it as multi-backend for now, because it can run with Keras 3 or Keras Core for the time being, and we will have zero coverage for the actual Keras 3 code path till we get that pip release.
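As a rough sketch of what the two markers discussed here might look like (the `multi_backend()` helper and the import path are assumptions on my part, not code from this PR):

```python
import pytest

from keras_nlp.backend import config as backend_config  # assumed import path

# Skip everywhere except the TensorFlow backend.
tf_only = pytest.mark.skipif(
    not backend_config.backend() == "tensorflow",
    reason="tests only run on tf backend",
)

# Skip unless multi-backend Keras (Keras 3 / Keras Core) is in use. Once a
# Keras 3 pip release exists, this could be renamed to `keras_3_only`.
multi_backend_only = pytest.mark.skipif(
    not backend_config.multi_backend(),  # assumed helper
    reason="tests only run with multi-backend Keras",
)
```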
```python
    inner_dense,
    rank=8,
    alpha=32.0,
    lora_a_initializer="variance_scaling",
```
Does it make sense to include "lora" as a prefix for this variable name, given that it's already a LoraDense layer?
I think it's okay, because the two dense layers in LoRA are called "lora_A" and "lora_B" (in the official code). Calling this `a_initializer` would look weird :P.
Yeah, I wanted to avoid `layer.a` and `layer.b`. We could come up with our own names, `layer.inner_kernel_update` and `layer.outer_kernel_update`, but I suspect `lora_a` and `lora_b` will be more recognizable to people.
Sounds fine to me.
```python
    self,
    inner_dense,
    rank=8,
    alpha=32.0,
```
Generally, `alpha = rank`. I know the guide has `alpha = 32.` and `rank = 4`, but it was an oversight on my part. The authors state that they went with `alpha = rank` and tuned the learning rate.
Source: Section 4.1 in https://openreview.net/forum?id=nZeVKeeFYf9
Thanks! Seems like we could:
- Default `alpha=8.`, so the type is obvious.
- Default `alpha=None` and assign `float(rank)` if unset.
- Just expose `scale` directly, and default `scale=1.`.
- Don't expose anything.

Maybe just `alpha=8.`? Simple and what people would expect?
I think this approach looks good.
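To make the options above concrete, here is a small sketch of how `alpha` could resolve to the LoRA scaling factor (`scale = alpha / rank`, per the paper); the function name and defaults are illustrative only, not the final API:

```python
# Illustrative only: resolve `alpha` to the LoRA scaling factor.
def resolve_scale(rank, alpha=None):
    # The `alpha=None` option: fall back to `float(rank)`, so the
    # default scale is exactly 1.0, matching `alpha = rank`.
    if alpha is None:
        alpha = float(rank)
    return alpha / rank

print(resolve_scale(rank=8))              # 1.0 (alpha defaults to rank)
print(resolve_scale(rank=8, alpha=32.0))  # 4.0
```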
Was passing through, left some NITs
Force-pushed from a1abbe6 to 3a8649b.
Co-authored-by: Abheesht <[email protected]>
Force-pushed from 3a8649b to 1313bcb.
/gcburn
I think this is ready for another round!
/gcbrun
/gcbrun
Looks good!
```python
    lora_a_initializer: The initializer to use for the inner projection
        from layer inputs to the inner `rank` intermediate outputs.
    freeze_kernel: If true, the kernel of the inner dense layer will have
        `trainable` set to False.
```
IIRC we backtick `False` in these contexts?
#1264 shows how this will eventually fit together.
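For reference, a hedged usage sketch of how the layer from this PR might be applied. The constructor arguments mirror the diffs above; the import path, the `build` call, and the exact defaults are assumptions on my part, not the finalized API:

```python
import keras

from keras_nlp.layers import LoraDense  # assumed import path

# Wrap an existing dense projection with a low-rank update.
inner_dense = keras.layers.Dense(64)
lora_dense = LoraDense(
    inner_dense,
    rank=8,
    alpha=8.0,
    lora_a_initializer="variance_scaling",
    freeze_kernel=True,  # sets `trainable=False` on the inner kernel
)
lora_dense.build((None, 16))

# Only the low-rank `lora_a`/`lora_b` kernels (plus the bias) should remain
# trainable after the inner kernel is frozen.
print([v.name for v in lora_dense.trainable_variables])
```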