Naive gradient descent example #75

Open: wants to merge 13 commits into master

Conversation

wavewave
Member

Using the ad library, this example finds a minimum of a non-trivial function (the Rosenbrock function) by naive gradient descent, once in pure Haskell and once with generated C code linked via FFI, and compares the two results.

@wavewave wavewave force-pushed the wavewave-flakes-clang-2-ad branch 2 times, most recently from 79243e9 to ba9a93d Compare May 31, 2022 22:43
@wavewave
Member Author

The Haskell and generated-C versions produce matching results. Here are the first 10 points of the minimum-finding history from each:

$ cabal v2-run categorifier-c-examples:grad-descent
Up to date
pure haskell
(0.1336,0.322)
(0.1671818315776,0.261169792)
(0.19943425546855276,0.21452578656192822)
(0.22938616556178515,0.17957543370040177)
(0.2564473526246645,0.15418394955054915)
(0.2803883026880033,0.13650020857407913)
(0.30127237915165805,0.12492368691611518)
(0.31936335352259115,0.11809195882083222)
(0.33503265962933526,0.11487215737130488)
(0.3486838274962453,0.11434710250070511)
codegen C
(0.1336,0.322)
(0.1671818315776,0.261169792)
(0.19943425546855276,0.21452578656192822)
(0.22938616556178515,0.17957543370040177)
(0.2564473526246645,0.15418394955054915)
(0.2803883026880033,0.13650020857407913)
(0.30127237915165805,0.12492368691611518)
(0.31936335352259115,0.11809195882083222)
(0.33503265962933526,0.11487215737130488)
(0.3486838274962453,0.11434710250070511)
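
For readers who want to see the iteration behind these numbers, here is a minimal hand-written sketch in C of a naive gradient descent on the Rosenbrock function. It is not the code from this PR: the gradient is derived by hand instead of coming from ad, and the `point` type and all function names are hypothetical. The parameter values used in the usage note (a = 1, b = 10, step size 0.01, start point (0.1, 0.4)) are assumptions inferred from the printed iterates, since working the update by hand with them reproduces the first few listed points; they are not stated anywhere in this conversation.

```c
#include <stddef.h>

/* Hypothetical 2-D point type, used only for this illustration. */
typedef struct { double x, y; } point;

/* Rosenbrock function f(x, y) = (a - x)^2 + b * (y - x^2)^2. */
double rosenbrock(double a, double b, point p) {
  const double u = a - p.x;
  const double w = p.y - p.x * p.x;
  return u * u + b * w * w;
}

/* Hand-derived gradient of f with respect to (x, y). */
point rosenbrock_grad(double a, double b, point p) {
  const double w = p.y - p.x * p.x;
  point g;
  g.x = -2.0 * (a - p.x) - 4.0 * b * p.x * w;
  g.y = 2.0 * b * w;
  return g;
}

/* One naive gradient descent step: p - step * grad f(p). */
point descent_step(double a, double b, double step, point p) {
  const point g = rosenbrock_grad(a, b, p);
  point q;
  q.x = p.x - step * g.x;
  q.y = p.y - step * g.y;
  return q;
}

/* Writes n successive iterates (excluding the start point) into history. */
void descend(double a, double b, double step, point start,
             size_t n, point *history) {
  point p = start;
  for (size_t i = 0; i < n; ++i) {
    p = descent_step(a, b, step, p);
    history[i] = p;
  }
}
```

Under the assumed values, calling `descend(1.0, 10.0, 0.01, start, 10, history)` with `start = {0.1, 0.4}` should yield iterates close to the listing above, up to floating-point rounding, since the hand-written gradient orders its operations differently from the generated code.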

@wavewave
Member Author

And here is the generated C file (the gradient of the Rosenbrock function):

$ cat /tmp/dRosenbrock.c

/* This file was AUTOMATICALLY GENERATED from a collection of Haskell expressions. */
/* Any modification you make here WILL BE OVERWRITTEN in short order. */
/* This function was generated using the `CExpr` DSL defined in `code_generation/ktypes`. */
/* See `generateCExprFunction` for the top-level entry point. */
/* Questions or bugs?  Please open a Jira ticket against the Tools team. */
          

#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <stdint.h>

// NOLINTNEXTLINE(readability-identifier-naming)
void dRosenbrock (const bool input_bool[0] __attribute__((unused))
, const int8_t input_int8_t[0] __attribute__((unused))
, const int16_t input_int16_t[0] __attribute__((unused))
, const int32_t input_int32_t[0] __attribute__((unused))
, const int64_t input_int64_t[0] __attribute__((unused))
, const uint8_t input_uint8_t[0] __attribute__((unused))
, const uint16_t input_uint16_t[0] __attribute__((unused))
, const uint32_t input_uint32_t[0] __attribute__((unused))
, const uint64_t input_uint64_t[0] __attribute__((unused))
, const float input_float[0] __attribute__((unused))
, const double input_double[4] __attribute__((unused))
, bool output_bool[0] __attribute__((unused))
, int8_t output_int8_t[0] __attribute__((unused))
, int16_t output_int16_t[0] __attribute__((unused))
, int32_t output_int32_t[0] __attribute__((unused))
, int64_t output_int64_t[0] __attribute__((unused))
, uint8_t output_uint8_t[0] __attribute__((unused))
, uint16_t output_uint16_t[0] __attribute__((unused))
, uint32_t output_uint32_t[0] __attribute__((unused))
, uint64_t output_uint64_t[0] __attribute__((unused))
, float output_float[0] __attribute__((unused))
, double output_double[2] __attribute__((unused))) {
  const double v0 = 0x0p+0 /* 0.0 */;
  const double v1 = 0x0p+0 /* 0.0 */;
  const double v2 = input_double[2];
  const double v3 = 0x1p0 /* 1.0 */;
  const double v4 = -(v3);
  const double v5 = input_double[3];
  const double v6 = v2 * v2;
  const double v7 = v5 - v6;
  const double v8 = input_double[1];
  const double v9 = 0x1p0 /* 1.0 */;
  const double v10 = 0x1p0 /* 1.0 */;
  const double v11 = v9 * v10;
  const double v12 = v1 + v11;
  const double v13 = v8 * v12;
  const double v14 = v1 + v13;
  const double v15 = v7 * v14;
  const double v16 = v1 + v15;
  const double v17 = v7 * v14;
  const double v18 = v16 + v17;
  const double v19 = v4 * v18;
  const double v20 = v1 + v19;
  const double v21 = v2 * v20;
  const double v22 = v1 + v21;
  const double v23 = v2 * v20;
  const double v24 = v22 + v23;
  const double v25 = input_double[0];
  const double v26 = v25 - v2;
  const double v27 = 0x1p0 /* 1.0 */;
  const double v28 = v27 * v10;
  const double v29 = v1 + v28;
  const double v30 = v26 * v29;
  const double v31 = v1 + v30;
  const double v32 = v26 * v29;
  const double v33 = v31 + v32;
  const double v34 = v4 * v33;
  const double v35 = v24 + v34;
  const double v36 = v0 + v35;
  const double v37 = 0x1p0 /* 1.0 */;
  const double v38 = v37 * v18;
  const double v39 = v1 + v38;
  const double v40 = v0 + v39;
  output_double[1] = v40;
  output_double[0] = v36;

}
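
As a reading aid, the straight-line code above can be collapsed by hand into closed form. The sketch below assumes that `input_double` holds (a, b, x, y) in that order. That is an interpretation based on the `Input (Param a b) (XY x y)` pattern in this PR, not something the generated file states. The intermediate names refer to the `v` variables in the listing.

```latex
\begin{align*}
f(x, y) &= (a - x)^2 + b\,(y - x^2)^2 \\[4pt]
v_7 &= y - x^2, \qquad v_{14} = b, \qquad v_{18} = 2b\,(y - x^2) \\
v_{24} &= 2x \cdot (-v_{18}) = -4bx\,(y - x^2) \\
v_{34} &= -\,2\,(a - x) \\[4pt]
\texttt{output\_double[0]} &= v_{36} = v_{24} + v_{34}
  = -4bx\,(y - x^2) - 2\,(a - x) = \frac{\partial f}{\partial x} \\
\texttt{output\_double[1]} &= v_{40} = v_{18}
  = 2b\,(y - x^2) = \frac{\partial f}{\partial y}
\end{align*}
```

The many additions of `v1` (zero) and multiplications by one in the listing look like accumulator updates left unsimplified by the reverse-mode differentiation; they do not change the result.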

@wavewave wavewave force-pushed the wavewave-flakes-clang-2-ad branch from 51103de to 9d66c45 Compare June 2, 2022 19:25
@wavewave wavewave changed the base branch from wavewave-flakes-clang-2 to wavewave-flakes-clang June 2, 2022 19:25
@wavewave wavewave linked an issue Jun 4, 2022 that may be closed by this pull request
@wavewave wavewave changed the base branch from wavewave-flakes-clang to master June 11, 2022 18:50
@wavewave wavewave marked this pull request as ready for review June 21, 2022 03:59
@wavewave wavewave requested review from zliu41 and sellout June 21, 2022 03:59
@wavewave wavewave changed the title [WIP] Naive gradient descent example Naive gradient descent example Jun 21, 2022
@sellout sellout left a comment

This could also use a bunch of Haddock / comments about what it's doing and why – e.g., what does this example illustrate; why do we need f parameters in this case (because ad, yada yada); why is wrap_rosenbrockF uncurried; etc.


$(Categorify.separately 'rosenbrockF [t|C.Cat|] [pure [t|C|]])

$(Categorify.separately 'dRosenbrockF [t|C.Cat|] [pure [t|C|]])

separately no longer exists on master. This should be Categorify.function. Also, we don't need $(...) on top-level splices.


$(embedFunction "rosenbrockF" wrap_rosenbrockF)

$(embedFunction "dRosenbrockF" wrap_dRosenbrockF)

Same re: $(...) on top-level splices.


dRosenbrock :: forall a. Num a => (a, a) -> (a, a) -> (a, a)
dRosenbrock (a, b) (x, y) =
  let rosenbrock' :: forall s. Reifies s Tape => [Reverse s a] -> Reverse s a

There is a Pair type defined in categorifier that seems better than [] here, since matching on it isn't partial.

  in (dfdx, dfdy)

rosenbrockF :: KType1 f => Input f -> Output f
rosenbrockF (Input (Param a b) (XY x y)) = Output $ rosenbrock (a, b) (x, y)

I feel like we can get rid of Input and Output, and just have

 rosenbrockF :: KType1 f => (Param f, XY f) -> f Double
 rosenbrockF (Param a b, XY x y) = rosenbrock (a, b) (x, y)

I'd rather have this be Param f -> XY f -> f Double, but I'm guessing it needs to be uncurried for embedFunction?

dRosenbrockF :: forall f. (KType1 f) => Input f -> XY f
dRosenbrockF (Input (Param a b) (XY x y)) =
  let (dfdx, dfdy) = dRosenbrock (a, b) (x, y)
   in XY dfdx dfdy

Similar change here

dRosenbrockF :: forall f. (KType1 f) => (Param f, XY f) -> XY f
dRosenbrockF (Param a b, XY x y) = uncurry XY $ dRosenbrock (a, b) (x, y)


$(Categorify.separately 'rosenbrockF [t|C.Cat|] [pure [t|C|]])

$(Categorify.separately 'dRosenbrockF [t|C.Cat|] [pure [t|C|]])

Orthogonal to this change, but I do want to put together an example that does something similar to this, without ad, like

Categorify.expression @C.Cat (unD (Categorify.expression @ConCat.RAD dRosenbrockF))

to illustrate nested categorification (which may not actually work yet).


Successfully merging this pull request may close these issues.

Provide non-trivial "ad" example
2 participants