-
Notifications
You must be signed in to change notification settings - Fork 2
/
NotesOnGet.html
199 lines (198 loc) · 8.89 KB
/
NotesOnGet.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html> <head>
<link REL=stylesheet HREF=Rtech.css>
<title>Functions `get', `assign', etc.</title>
</head>
<body>
<h1>Functions <code>get</code>, <code>assign</code>, etc.</h1>
<h3>Problems and Proposed Solutions for Functions that Access Objects</h3>
<p>
The functions <code>get</code>, <code>exists</code>, <code>and</code> assign all take a second argument that
refers, generally, to <code>where</code> the action of the function should take
place. The functions <code>objects</code> and <code>rm</code> also take similar arguments.
<p>
From the user's view, <code>where</code> can be a workspace image, an attached
package, an environment (maybe of a currently active call or maybe a
special environment). In the future we will likely want it to
represent other things as well, for example when we're interfacing to
database software.
<p>
The functions should allow any meaningful object and should then behave consistently, up to
differences in what they are doing.
<p>
Right now, that's not entirely the case. For example, to supply an
environment as an argument, you must use a separate <code>envir=</code> argument.
<p>
The following list suggests some changes for this and related problems.
Items 1 and 2 are being pushed for immediate action; the
remainder seem desirable, but are either not quite as back-compatible or
else depend on other changes.
<p>
The proposed changes for 1 and 2
have been added to the development branch.
<p>
<ol>
<li> Allow environments to be supplied directly.
<p>
There's really quite a clean concept operating here: whatever is
specified as an argument is essentially coerced to be an R
environment. Unfortunately, it's done a different way depending
on what the user supplies and on which function is called.
Numbers are passed to pos.to.env. Character strings are first
explicitly matched (by some code that relies on lazy evaluation to
work). And the simplest case, when the user supplies an
environment, requires the actual call to be different.
<p>
That last problem causes messy, inefficient, and error-prone
programming in higher-level functions that try to treat
environments and other databases uniformly.
<p>
The proposed solution is to replace the calls to <code>pos.to.env</code> by
calls to a new function, <code>as.environment</code>, which would work
uniformly for the various cases.
<p>
At the same time, this eliminates the ad hoc code dealing with
character string arguments, turning the body of get, exists, and
assign just into .Internal calls.
The claim is that this change is back-compatible with existing
code.
<p>
For efficiency, the current <code>as.environment</code>
implementation should be replaced by a version in C.
<p>
<li> The arguments to the objects function.
<p>
In principle, the first argument to objects is just another
<code>where</code> argument. The current implementation has 3 related
arguments, <code>name</code>, <code>pos</code> and <code>envir</code>, and the treatment of the
first one is, well, strange.
<pre><code>
if (!missing(name)) {
if (!is.numeric(name) || name != (pos <- as.integer(name))) {
name <- substitute(name)
if (!is.character(name))
name <- deparse(name)
pos <- match(name, search())
}
envir <- pos.to.env(pos)
}
</code></pre>
I _think_ the intention here is to allow the names on the search
list to be supplied with or without quotes, so <code>"package:base"</code> or
<code>package:base</code> would both work.
<p>
If so: Alas, it won't fly. The culprit is that expression
<code>is.numeric(name)</code>, which will have to evaluate <code>name</code>. Lots of
luck with <code>package:base</code>! Generally, you can't put a <code>substitute(x)</code>
call into anything conditional on the value of x and not expect to
go down in flames.
<p>
Because the subsitute is used unless the evaluation produces a
numeric, one can't supply an expression that evaluates to a string
or an environment.
<p>
The proposed modified version retains the basic idea, but
evaluates <code>name</code> in a try() expression and performs the subsitute
only if the try fails. This should (usually!) even work for
<code>package:base</code>.
<p>
<p>
<li> Inconsistent arguments.
<p>
The argument lists of get, exists, and assign aren't consistent
with each other, or with S-Plus. (The S-Plus arguments aren't
entirely consistent either, but closer.)
<p>
Two issues are whether it's <code>where</code> (as in exists and in S-Plus)
or <code>pos</code> (as in the other functions) and whether a separate
<code>frame</code> argument is allowed (yes in exists and in S-Plus, no
elsewhere). Where <code>frame</code> is allowed, it's equivalent to
supplying the environment sys.frame(frame), but the semantics are
a bit confusing.
<p>
3.(a) For the first issue, it would be nice to use <code>where</code> throughout,
but there is clearly a serious compatibility issue.
<p>
3.(b) For the second, one could add a frame argument and treat it
consistently by making the default expression for the
environment be
(if(missing(frame)) as.environment(where) else sys.frame(frame))
This is what the revised version of exists does.
<p>
<p>
<p>
<li> rm and remove
<p>
Right now, these are identical. That doesn't seem too useful, and
it would be mildly more convenient if remove expected a character
vector, as it does in S-Plus. It would just let one say
remove(objects(....))
rather than
remove(list = objects(....))
(On the other hand, at least both rm and remove in R take a
position argument, which is more consistent than S-Plus, where
remove does but rm doesn't.)
<p>
<p>
<li> Methods
<p>
What's really going on with the <code>where</code> argument is that we want
to make the various functions behave correctly for the database
defined by <code>where</code>.
<p>
In other words, we would like get and the other functions to have
methods based on the <code>where</code> argument.
<p>
We can't do that directly until S4-style methods are introduced,
since for most of the functions the corresponding argument is not
the first one.
<p>
As an interim solution, we could make the as.environment function
mentioned in item 1 into an S3-style generic. If it became a
primitive (as pos.to.env is now) we could dispatch it from the C
code if efficiency is a big deal here (which I doubt).
<p>
The interim solution is not as good, though, for future work,
because in some cases (imagine various database interfaces that
let you attach a database table) you don't want to create in
intermediate R environment object, but rather to go off to the
appropriate interface directly for each of the functions. For
that, you really do want methods for get, exists, etc.
<p>
<p>
<li> The -1 value for <code>pos</code> and the <code>inherits</code> argument.
<p>
In terms of the current implementation, this amounts to what
pos.to.env(-1) means. From the implementation and the error
message, it means "the environment of the parent of the call to
pos.to.env". The documentation of pos.to.env, on the other hand,
claims you have to give a positive integer as an argument.
<p>
The actual semantics are mostly OK but the situation is a bit subtle.
When no <code>where</code> or <code>pos</code> argument is given <code>exists</code> or <code>get</code>, the
intended semantics is "search for the name in this function call
and then in the search list". What seems confusing, if not
downright inconsistent, is that the <code>-1</code> argument
behaves rather differently depending on the function.<p>
For example, the functions <code>rm</code> and <code>assign</code> have
optional <code>inherits</code> arguments also. If
<code>inherits</code> is <code>TRUE</code>, then both functions
search back through parents to match names.
Since both functions are ``destructive'' to the environments they
work in, I don't think it's a great idea to have them swinging
back into, for example, the global environment from inside a
function call.
For <code>rm</code>
this is inconsistent with S-Plus, and rather nervous-making. For
<code>assign</code>, it seems bizarre--why would the
existence of an object with the same name affect the behavior of
<code>assign</code>?
(At least <code>inherits</code> is <code>FALSE</code> by default
in these functions.)
</ol>
<hr>
<address><a href="http://cm.bell-labs.com/cm/ms/departments/sia/jmc/">John Chambers</a><a href=mailto:[email protected]><[email protected]></a></address>
<!-- hhmts start -->
Last modified: Mon Aug 13 14:57:05 EDT 2001
<!-- hhmts end -->
</body> </html>