Newsgroups: comp.lang.beta
Subject: Re: Am I missing something obvious
From: Erik Ernst
Organization: Department of Computer Science, University of Aalborg, Denmark
Date: Fri, 10 Nov 2000 16:42:41 GMT

>>>>> "Alejandro" == Alejandro Villanueva <190921@cepsz.unizar.es> writes:

(Long-time-too-much-work-to-read-comp.lang.beta, and then there is
such an interesting discussion going on when I look.. :-)

Alejandro> Well, having read section 5.10.2 of the BETA book, I'm
Alejandro> told that the procedure will be inlined.

The compiler can create one instance and reuse it, so "caching" might
be more precise than "inlining".  The BETA book does say 'similar to
an inline procedure call', but that could then be described as an
extra optimization opportunity in the cases where it is statically
known that the object will not be used in any way.

Actually, the semantics in the BETA book was never implemented (as far
as I know).  At the time, it seemed to be reasonable to let
programmers give the compiler a hint that certain method invocations
could be reused.  It was entirely up to the programmer to ensure that
this caching would not have ill effects (such as reusing the same
"activation record" for a recursive method, thus overwriting local
variables in other invocations of the same method).

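The ill effect mentioned in that parenthesis can be sketched outside
BETA.  What follows is only a Python analogy, not BETA semantics: the
class 'Record' and the functions 'run_fresh' and 'run_shared' are
made-up names, and an explicit object stands in for the activation
record.  Under those assumptions it shows a recursive invocation that
either gets a new record every time or reuses one cached record.

```python
class Record:
    """Stands in for one activation record with a single local, i."""

    def __init__(self):
        self.i = 0

    def call(self, value, next_record):
        self.i = value                      # plays the role of 'enter i'
        if self.i < 10:
            # The recursive invocation runs in whatever record
            # next_record() hands out; its result is discarded.
            next_record().call(self.i + 1, next_record)
        return self.i                       # plays the role of 'exit i'


def run_fresh():
    # A new record per invocation: ordinary recursion.
    return Record().call(0, Record)


def run_shared():
    # One cached record reused by every invocation.
    cached = Record()
    return cached.call(0, lambda: cached)


print(run_fresh())   # 0
print(run_shared())  # 10
```

With a fresh record the outermost invocation still sees its own i.
With the cached record the innermost invocation has overwritten it by
the time the outermost one returns.
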
The fact is that the inserted items (syntactically they are actually
InsertedItem or AttributeDenotation, used as an Imperative, i.e., a
statement) are never cached; a new object is created every time, just
as if you had written the "&".  So you may in fact just forget all
about "&" in connection with method invocations.

There are good reasons why you would want to keep it that way (and
change the BETA book :-)

 - It is an optimization hint from programmers, and such hints should
   probably be kept out of the language; it would be better for the
   language to optimize provably safe cases (i.e., where it makes no
   difference except for better performance) and let the programmer
   concentrate on "real" design.

 - It is an unsafe optimization: If it is applied in a (directly or
   indirectly) recursive method it will cause very funny bugs, because
   of the shared-local-variables problem.  I just tried this:

     ORIGIN '~beta/basiclib/betaenv'
     -- program:descriptor --
     (#
        m: (# i: @integer
           enter i
           do (if i<10 then i+1->m if)
           exit i
           #);
        m2: @(# i: @integer
           enter i
           do (if i<10 then i+1->m2 if)
           exit i
           #)
     do 0->m->putint; newline; 0->m2->putint
     #)

   This program prints '0' and '10'.  In the first case, 'm' is an
   ordinary recursive procedure, and it behaves as expected (as if we
   had used "&" before every non-defining occurrence of 'm').  In the
   second case, using 'm2', I've simulated the caching, and as we can
   see it leads to clobbering of 'i'.  All the recursive invocations
   of 'm2' are sharing the same 'i'.

 - I strongly feel that such a semantics should not be the
   (syntactically easier) default case.  If we really want to reuse
   the activation record of a method invocation (which is effectively
   what the caching would give us) then we could just specify that
   explicitly, just by writing and using 'm2' in place of 'm'.  So
   it's not even hard to get the same thing when we want it.  But we
   shouldn't get it by accident!

I've reinterpreted the symbol "&" slightly in the context of the
language gbeta.
Here, it means "must be new".  So we could have this:

  (#
     m: (# do <> #);
     m2: @(# do <> #);
  do
     m;    (* OK, create instance of 'm', execute it *)
     m2;   (* OK, execute the existing object called 'm2' *)
     &m;   (* OK, works like 'm', but also documents object creation *)
     &m2;  (* ERROR! 'm2' _is_ an object, cannot create one *)
  #)

When "&" is taken to mean "must be new", we are effectively saying
"get hold of 'm', then create a new object and execute it".  This is
perfectly acceptable when 'm' is a pattern, and it works the same as
it always has.  A stand-alone 'm' would also cause the creation of a
new object, and that would also be the same behavior as today.  But it
would not be acceptable with 'm2', because 'm2' denotes an existing
object and not a new one.

The difference is that "&" is used to tell the programmer that there
_must_ be a new object involved.  This works as a call-site mark (both
as documentation and with automatic checking by the compiler), to
ensure that it is indeed possible to obtain a new object at that
point.  So it's a signal from one programmer to another that "I really
depend on this being a new object, every time".  Since sharing of
state is semantically significant, it makes sense that programmers be
able to specify this kind of constraint.

If you as a programmer do not specifically insist on having a new
object every time, then just leave out the "&".  In that case it will
be possible for other people to change the program in such a way that
the "invoked method" becomes a reused activation record.  Just edit
the declaration to make it look like 'm2' above.

With this approach, "&" is used in a backward compatible way, and it
enables programmers to require something semantically useful ("a new
object every time").  A compiler would then be allowed to do caching,
inlining and whatever _only_ in cases where it would provably not
change the visible behavior of the program.

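As a rough illustration of the "must be new" rule, here is a small
Python analogy.  It is not gbeta: the names 'M', 'invoke' and
'must_be_new' are invented for the sketch, a Python class stands in
for a pattern and an instance for an object, and the check happens
when the call runs, whereas the point above is that a compiler can
make it statically.

```python
class M:
    """Stands in for a pattern; every instance counts its own runs."""

    def __init__(self):
        self.runs = 0

    def run(self):
        self.runs += 1
        return self.runs


def invoke(target, must_be_new=False):
    """Execute target; must_be_new=True plays the role of '&'."""
    if isinstance(target, type):
        obj = target()          # a pattern: instantiate, then execute
    elif must_be_new:
        raise TypeError("target is already an object, cannot create one")
    else:
        obj = target            # an existing object: execute it as is
    return obj.run()


m2 = M()                        # roughly  m2: @(# ... #)

print(invoke(M))                     # 1, a fresh object every time
print(invoke(M, must_be_new=True))   # 1, same behavior, but checked
print(invoke(m2))                    # 1
print(invoke(m2))                    # 2, state is shared between calls
try:
    invoke(m2, must_be_new=True)     # the '&m2' case
except TypeError as err:
    print("rejected:", err)
```

The two calls on 'm2' returning different numbers are exactly the kind
of shared state that a call site marked with "&" is promising it does
not depend on.
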
Alejandro> This is ok for
Alejandro> something like:

Alejandro> P: (#
Alejandro>      I, J: @Integer;
Alejandro>    enter (I, J)
Alejandro>    do I+J -> I
Alejandro>    exit (J, I)
Alejandro>    #)

Alejandro> TEST: @(#
Alejandro>      N, M: @Integer;
Alejandro>    do
Alejandro>      (2, 3) -> P -> (N, M);
Alejandro>    #)

Yes.

Alejandro> [..] But... what about this one:

Alejandro> Q: (#
Alejandro>      A, B: @Integer;
Alejandro>      PP: (#
Alejandro>           I, J: @Integer;
Alejandro>         enter (I, J)
Alejandro>         do A+I -> A; B+J -> B;
Alejandro>         exit (A, B)
Alejandro>         #)
Alejandro>    #)

Alejandro> TEST: (#
Alejandro>      Q1: @Q;
Alejandro>      N, M: @Integer;
Alejandro>    do
Alejandro>      3 -> Q1.A;
Alejandro>      5 -> Q2.B;
                      ^ 1, I presume?
Alejandro>      (1, 2) -> Q1.PP -> (N, M);
Alejandro>    #)

This is actually just fine.  You are accessing the 'A' attribute of
the 'Q' object, etc.  That's essentially the same as

  class Point { public int x,y; }
  Point p = new Point();
  p.x = 3; p.y = 5;

'Q1.PP' is an "inserted item", and that would allow the compiler to
create and reuse an instance of the pattern Q1.PP (with the old
interpretation of inserted item).  Such an instance would be nested
inside the object Q1, so it would work on the 'A' and 'B' attributes
of Q1.  This would be a case where the object caching makes no
difference, so even according to my (stricter but safer) semantics, it
could be cached.

It could not be inlined as source code, because that would break the
link between name applications like 'A' and the associated
declarations in the enclosing instance of 'Q'.  In general, we cannot
move code around (such as by inlining or whatever) and expect it to
have the same meaning, because name applications may resolve to
entities declared in enclosing scopes.  But that is a general property
of inlining: we cannot expect to be able to inline code without
modifying it in such a way that it works the same way after being
moved.
Whether this modification happens on source code, abstract syntax
trees, intermediate language, or whatever is a matter of
implementation.

Alejandro> where Q1 is not a pattern, but a pattern instance? How
Alejandro> to inline it? What's the final value of N and M? and
Alejandro> why? What's the difference
Alejandro> if I wrote (1, 2) -> &Q1.PP -> (N, M) instead?

That makes no difference, in this special case.  Well, you could not
use a pattern for Q1, because an expression like 'Q.PP' is an error
(as a statement, at least).  If you are "dotting into" anything, then
that anything had better be an object.  But since "&" would apply to
'PP' anyway in 'Q1.PP', it makes no difference.

  regards,

--
Erik Ernst                                    eernst@cs.auc.dk
Department of Computer Science, University of Aalborg, Denmark