[Arc] Implement memory initializers #7559

Draft
fzi-hielscher wants to merge 4 commits into main

Conversation

fzi-hielscher
Contributor

Following #7480, this PR adds support for initializing arcilator memories. The code still needs tidying and testing, but, as always, I appreciate early feedback on the overall approach.

Specifically, it adds the arc.initmem.filled and arc.initmem.randomized operations, enabling initialization with a constant value and with runtime-generated random values, respectively. Both ops produce an SSA value of the newly added !arc.memory_initializer<* x *> type, indicating that they can be used to initialize memories irrespective of size and word type. Sadly, there is currently no front-end or middle-end equivalent of these operations, so for the moment they mostly serve as a template showing how initialization can be handled with and without help from the runtime environment.
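
For illustration, the two ops and the wildcard initializer type might look roughly like this (the exact assembly format is not settled; this is only a sketch):

%fill = arc.initmem.filled 42 : !arc.memory_initializer<* x *>   // every word set to a constant value
%rand = arc.initmem.randomized : !arc.memory_initializer<* x *>  // words filled with runtime-generated random values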

The memory_initializer value is passed as an optional argument to the arc.memory op. During LowerState it is moved to the 'initial' pseudo clock-tree and an arc.initialize_memory op is inserted to associate the initializer with the memory's state. Finally, the newly added LowerMemoryInitializers pass converts the initializers to low-level state writes, which, depending on the initializer, may involve invoking the runtime environment with a reference to the storage. This uses the arc.call_environment caller and arc.environment_call callee op pair, which are also new in this PR. Right now they don't do anything special, but I think it makes sense to separate calls to the runtime environment from generic function calls.
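
To make the flow concrete, a heavily simplified sketch (op names as added in this PR; the assembly syntax is illustrative, and the callee name is the one discussed in the review below):

// Before LowerState: the initializer is an optional operand of arc.memory.
%init = arc.initmem.randomized
%mem = arc.memory ... initial %init

// After LowerState: %init lives on the 'initial' pseudo clock-tree, tied to the
// memory's state via
//   arc.initialize_memory %init, %mem

// After LowerMemoryInitializers: a constant fill becomes plain state writes; a
// randomized fill calls into the runtime environment with a storage reference:
//   arc.call_environment @_arc_env_fill_randomized(%storage) : (...) -> ()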

Speaking of the runtime environment: As discussed in #7445, the increasing complexity makes implementing it as, effectively, a header library impractical. I'm still working on restructuring it, and it probably doesn't make much sense to land this PR before that has been done (the new call ops also don't really belong in this PR). In the end, this should allow us to add an initializer op for .mem files, with the static reading and parsing logic stashed away in the runtime library.

fzi-hielscher added the Arc (Involving the `arc` dialect) label on Aug 28, 2024
Member

@uenoku left a comment

Great, thank you for working on this! I have a few comments on the design, but generally it looks great! Relatedly, I'm working on extending seq.firmem/hlmem to take a seq.immutable operand, which looks like:

func.func private @random() -> i32  // external

hw.module @Foo() {
  %mem_random_init = seq.initial {
    %0 = func.call @random() : () -> i32
    // ...
    %63 = func.call @random() : () -> i32
    %init = hw.array_create %0, ..., %63 : i32
    seq.yield %init : !hw.array<64xi32>
  } : !seq.immutable<!hw.array<64xi32>>

  %mem = seq.firmem 0, 1, undefined, undefined initial %mem_random_init : <64 x 32>

  %mem_filled_init = seq.initial {
    %0 = hw.constant 42 : i32
    %init = hw.array_create %0, %0, %0, ... : i32
    seq.yield %init : !hw.array<64xi32>
  } : !seq.immutable<!hw.array<64xi32>>
}

I think this representation can be lowered into arc.storage+arc.initial in LowerState and can represent the same initialization as InitMemoryFilledOp and InitMemoryRandomizedOp.
I feel this approach would be more extensible than preparing special operations that encode specific kinds of initialization patterns.

@@ -905,4 +952,74 @@ def VectorizeReturnOp : ArcOp<"vectorize.return", [
let assemblyFormat = "operands attr-dict `:` qualified(type(operands))";
}

def EnvironmentCallOp : ArcOp<"environment_call", [
Member

Why not func.func? I think we can just create func.func @_arc_env_fill_randomized somewhere in the pipeline.

Contributor Author

For now there is no difference. But while toying around with the runtime library implementation I encountered several things that could warrant separating environment calls from other functions:

  • It could be helpful to export a list of the environment calls a specific model needs to the JSON file, potentially just as an indicator that the model requires the runtime library.
  • Having them separate gives us more control over the eventual lowering to LLVM IR functions and calls. This could come in handy to specify the linkage or to deal with calls that are 'exceptional', e.g., terminating the simulation.
  • At some point I probably want to do type marshaling, i.e., allow environment calls with "arbitrary" IR types and potentially multiple return values. This will require a dedicated transformation pass before LLVM lowering.

This is all purely hypothetical at the moment. And I'll move the EnvironmentCallOp and CallEnvironmentOp to another PR before landing this one.
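
For reference, a rough sketch of the intended split (assembly formats and the !llvm.ptr argument type are assumptions here, not the PR's final form):

// Callee: declares a symbol that the runtime environment is expected to provide.
arc.environment_call @_arc_env_fill_randomized(!llvm.ptr)

// Caller: invokes the environment function, here with a reference to the storage.
arc.call_environment @_arc_env_fill_randomized(%storage) : (!llvm.ptr) -> ()

// The func-based equivalent suggested above would be a plain declaration plus call:
func.func private @_arc_env_fill_randomized(!llvm.ptr)
func.call @_arc_env_fill_randomized(%storage) : (!llvm.ptr) -> ()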

Member

Thanks, that makes sense. I think the current use case in LowerMemoryInitializers can be represented by func.func, so I would first implement it with func.func.

This is all purely hypothetical at the moment. And I'll move the EnvironmentCallOp and CallEnvironmentOp to another PR before landing this one.

Thanks, I appreciated that :)

let hasCustomAssemblyFormat = 1;
}

def CallEnvironmentOp : ArcOp<"call_environment", [
Member

What's the difference from func.call (or sim.dpi.call) ?

Contributor Author

See above.

@@ -43,6 +43,37 @@ def MemoryType : ArcTypeDef<"Memory"> {
}];
}

def MemoryInitializerType : ArcTypeDef<"MemoryInitializer"> {
Member

I'm curious if we can just use the !seq.initial<!hw.array<...>> type (or maybe just !hw.array<...>) as an initializer. With that approach we don't have to introduce special types and operations for memory initializers. I can work on migrating to !seq.initial+!hw.array later, so I don't block the PR though.

Contributor Author

See my comment below.

Comment on lines +49 to +50
let parameters = (ins OptionalParameter<"unsigned">:$numWords,
OptionalParameter<"::mlir::IntegerType">:$wordType);
Member

Why is this optional?

Contributor Author

By not specifying the respective parameter, the initializer can be used for memories of any depth and/or word type.
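
For illustration, using the type syntax sketched later in this thread:

!arc.memory_initializer<* x *>       // any depth, any word type
!arc.memory_initializer<* x i32>     // any depth, fixed word type
!arc.memory_initializer<1024 x i32>  // fully specialized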

@fzi-hielscher
Contributor Author

Thanks for your comments @uenoku!

I think this representation can be lowered into arc.storage+arc.initial in LowerState and can represent the same initialization as InitMemoryFilledOp and InitMemoryRandomizedOp.
I feel this approach would be more extensible than preparing special operations that encode specific kinds of initialization patterns.

An initializer op with a region that allows accessing individual words of the memory is definitely on the list of things I'd want to add. If we could use seq.initial directly, that would be even better. My concerns with just using the seq.initial op as you have described it would be:

  • A blanket initialization with constant/random values is, I'd guess, the most common use-case. When applying this to memories that may have a depth in the tens of thousands (maybe even millions, at least for simulation), requiring an equally large ArrayCreate op (and, for random initialization, a distinct SSA value for every single word) seems excessive to me.
  • I think it would be nice to be able to express sparse initializers and to combine them. E.g., a .mem file may not initialize the entire memory. So, as a not entirely unrealistic use-case: Can we do a randomized fill first, then overlay a .mem file, and finally manually change a few words?

All of this of course depends on what we can and want to express in the frontends. But based on my implementation here, I could imagine a representation of the given example looking something like this:

// Use a random fill as base, producing '!arc.memory_initializer<* x *>'
%0 = arc.initmem.randomized

// Overlay a sparse .mem file, produce a type specialized initializer
%1 = arc.initmem.readmem hex "foo.mem" initial %0 : (!arc.memory_initializer<* x *>) -> !arc.memory_initializer<* x i32>

// Insert a custom value, produce a fully specialized initializer
%2 = arc.initmem.array initial %1 : (!arc.memory_initializer<* x i32>) -> !arc.memory_initializer<1024 x i32> {
  // Provide an argument containing the previous memory contents
  ^bb0 (%array: !hw.array<1024xi32>):
    %addr = hw.constant 0x123 : i10
    %val = hw.constant 0xcafe : i32
    // This op sadly doesn't exist, but can effectively be constructed using slice, create and concat
    %newInit = hw.array_inject %array[%addr], %val : !hw.array<1024xi32>
    arc.initmem.yield %newInit : !hw.array<1024xi32>
}

How useful is this? - I don't know. But I enjoy tinkering with the possibilities here. 😅 I'm in no hurry to land this. As mentioned, there is still plenty to do on the runtime library front. But I want to make sure that my stuff lines up with what you are doing in FIRRTL/Seq. The arc.initmem.array op would be pretty close to your seq.initial op example. So, again thanks for your feedback.

Member

@maerhart left a comment

Memory initialization and sparsity are really interesting topics and not handled well in arcilator yet. Thanks a lot for working on it!

}

def InitMemoryFilledOp : ArcOp<"initmem.filled", [Pure]> {
let arguments = (ins APIntAttr:$value, UnitAttr:$repeat);
Member

I'm not sure if making the word type optional and having this repeat attribute here is a good choice.
Is the motivation just to be able to reuse the same initmem.filled for memories with different word types or is there more?
I don't think having one such op per memory word type is a problem; it's a very simple op that doesn't even have a nested region.
Having the type explicitly specified seems like a bigger benefit to me and I'd also drop the repeat attribute.
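
In other words, something along these lines (hypothetical syntax, with the fill value carrying an explicit word type and no repeat attribute):

%fill32 = arc.initmem.filled 42 : !arc.memory_initializer<* x i32>
%fill8 = arc.initmem.filled 0 : !arc.memory_initializer<* x i8>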

Member

+1 for explicit types. I expect every memory's type to be statically determined at the Arc dialect level.

MemoryEffects<[MemWrite]>,
MemoryInitializerIsCompatible<"memory", "initializer">
]> {
let arguments = (ins MemoryInitializerType:$initializer, MemoryType:$memory);
Member

I'd like to learn more about your rationale for introducing separate init mem operations for filled, randomized, etc. vs. just having a function (either a func.func or some custom version) that takes the memory index/address as an argument and returns a value of the word type. So, for filled it would just return a hw.constant result, and for randomized we'd have a new random operation that could even live in the seq or HW dialect, so it is also usable in seq initializers to model firreg with compreg and get rid of the former.
This approach would be more general, as the returned value can depend on the memory address (but we could also omit this argument if we want it more restricted).

I'd imagine that this initializer function could then also be used to read a mem file and initialize the memory according to that?
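
A minimal sketch of the suggested shape, assuming a per-word initializer function that takes the address and returns the word's initial value (the function name and types here are made up):

// Hypothetical per-word initializer, called once per address during initialization.
func.func @init_word(%addr: i10) -> i32 {
  // "Filled" case: every word gets the same constant. A "randomized" variant would
  // instead call a dedicated random op/function here.
  %c42 = hw.constant 42 : i32
  return %c42 : i32
}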

Contributor Author

Frankly, there wasn't much of a rationale other than having something to start with that is easy to implement. I'm having a lot of trouble reasoning about the flexibility and generality we may or may not need without having anything in the frontend dialects to connect to. So, maybe this PR was a tad premature.

Having said that, an operation like the one you have described (kind of like Scala's tabulate, if I understand you correctly) also makes perfect sense to me. But having that on top of the arc.initmem.array I've sketched above would likely be redundant, and we would have to see which one aligns better with Seq.

Random initialization is a rabbit hole of its own. I haven't made my way through the entirety of the SystemVerilog LRM's Chapter 18 yet, but it gives an impression of what we could theoretically support. Initializing memories via a single op would, e.g., allow us to easily provide a PRNG seed attribute for each memory instance. Then again, I would not want to engineer this separately from register randomization.

Finally, .mem files are not really suited for indexed/addressed access. While in practice they will often simply contain one word per line, they can be quite a bit more complex. We'd have to parse the entire file into a dedicated buffer first (and somehow manage the file descriptor), and at that point I don't see the benefit over just 'atomically' parsing it into the state memory. But maybe I'm missing/misunderstanding something?

@uenoku (Member) commented Aug 30, 2024

A blanket initialization with constant/random values is, I'd guess, the most common use-case.

I see, that looks like a reasonable design point. I'm fine with introducing the arc.initmem.randomized operation as a starting point.
