- Aug 2023
-
-
The new venture’s approach contrasts with that of companies like OpenAI, which might feed all its data into one large AI program, rather than a series of littler ones.
-
Sakana’s approach could potentially lead to AI that’s cheaper to train and use than existing technology. That includes generative AI
Influence on costs
-
Sakana is still in the early stages: It hasn’t yet built an AI model and doesn’t have an office.
Very early stage.
-
The startup plans to make multiple smaller AI models, the kind of technology that powers products like ChatGPT, and have them work together. The idea is that a “swarm” of programs could be just as smart as the massive undertakings from larger organizations.
This sounds similar to our "bottom-up AI" approach
||JovanK||||sorina||||anjadjATdiplomacy.edu||
-
- Jul 2023
-
deliverypdf.ssrn.com
-
Example Prompt
Important: basic prompt structure according to the authors, regardless of the AI implementation:
- Role and Goal - who the AI is and what it is supposed to do
- Constraints - instructions to prevent unexpected AI actions
- Step-by-step instructions - a chain-of-thought instruction
- Personalization (optional) - instruct the AI to ask students for additional information (their interests, level of prior knowledge, etc.)
- Pedagogy (optional) - instructions for the AI to behave in a more pedagogical way
- Specific instructions (optional) - such as presenting students with a summary, asking them for reflections, etc.
AI Team suggestions for additional prompt steps/techniques (see the sketch below):
- Evaluation and validation (additional step) - ask the AI to check whether its answers may be offensive, inappropriate, not aligned with Diplo stances, etc.
- Retrieval Augmented Generation (RAG) (additional step and technique) - ask the AI to generate the answer based on information retrieved from Diplo materials
- Inner monologue (technique) - ask the AI to discuss a question with itself, to reach a deeper analysis before sending the answer to students
- Prompt chaining (technique) - ask the AI to break a complex question/task into multiple simple tasks
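A minimal sketch of how these sections could be assembled into one prompt (the helper and the section texts below are illustrative assumptions, not from the paper):

```python
# Hypothetical helper that assembles a mentor prompt from the sections above.
def build_prompt(role_and_goal, constraints, steps,
                 personalization=None, pedagogy=None, specific=None):
    """Concatenate the prompt sections in the order the authors suggest."""
    sections = [role_and_goal, constraints, steps]
    # Optional sections are appended only when provided.
    for optional in (personalization, pedagogy, specific):
        if optional:
            sections.append(optional)
    return "\n\n".join(sections)

prompt = build_prompt(
    role_and_goal="You are a friendly mentor who helps students improve their drafts.",
    constraints="Do not write the work for the student; only give feedback.",
    steps="First read the draft, then list strengths, then suggest concrete improvements.",
    personalization="Begin by asking the student about their interests and prior knowledge.",
)
print(prompt)
```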
-
Large Language Models and Prompt Compatibility
Very important! Test it on ChatGPT-4 connected to the Diplo knowledge base via the custom-made Retriever
Also, not tested on:
- Claude 2
- Perplexity
We do not have access to Claude 2 yet, and there is no API access to Perplexity. We could test Perplexity manually.
-
Students should report out their interactions with the AI and write a reflection about the guidance and help the AI provided and how they plan to incorporate (or not) the AI’s feedback to help improve their work.
Important: expectations of students in the AI-as-Mentor implementation process.
-
It can be very convincing, and have strong “viewpoints” about facts and theories that the models “believe” are correct.
A bit confusing; it seems to me that this is the same as the bias risk, but applied to the teaching process.
-
While ChatGPT offers a privacy mode that claims to not use input for future AI training, the current state of privacy remains unclear for many models, and the legal implications are often also uncertain.
Data privacy issue with models
-
AI is trained on a vast amount of text, and then receives additional training from humans to create guardrails on LLM output.
Model training + RLHF (Reinforcement Learning from Human Feedback)
-
GPT-4 (accessible via ChatGPT Plus or Microsoft Bing in Creative Mode) is the only model that consistently executes on the given prompts.
Advantage of GPT4 compared to other models
-
This approach helps to sharpen their skills while having the AI serve as a supportive tool for their work, not a replacement. Although the AI’s output might be deemed “good enough,” students should hold themselves to a higher standard, and be accountable for their AI use.
Also the Diplo approach to AI. Very important and relevant to other Diplo tasks, such as writing updates.
-
- Apr 2023
-
dev-ai.diplomacy.edu:8541
-
Cable breaks and damage happen all the time on UK broadband networks the size of Openreach’s, and they’re usually caused by accidents, such as third-party contractors ramming their diggers into the wrong patch of ground. But only very rarely do we see damage as bad as the one caused by a landslip in Kent this week.
||JovanK||
This seems interesting, please reflect
-
- Jun 2022
-
facctconference.org
-
Through line-by-line content analysis, we identify 59 values that are uplifted in ML research, and, of these, we find that the papers most frequently justify and assess themselves based on Performance, Generalization, Quantitative evidence, Efficiency, Building on past work, and Novelty. We present extensive textual evidence and identify key themes in the definitions and operationalization of these values. Notably, we find systematic textual evidence that these top values are being defined and applied with assumptions and implications generally supporting the centralization of power. Finally, we find increasingly close ties between these highly cited papers and tech companies and elite universities.
-
- Apr 2022
-
writer.com
-
GPT-3 alone doesn’t fact-check or verify sources. This could be a huge problem for content teams that rely on accuracy of information; for example, newsrooms or teams that publish reports like whitepapers. The text GPT-3 generates shows biases. As ethics researcher and Google whistleblower Timnit Gebru revealed, large language models like GPT-3 are exposed to the most harmful content on the internet. That means it can spit out racist tropes and gendered language. While GPT-3 is smart, it’s still a mindless piece of technology. Think of the tool as a writing parrot: With just a few examples, it can combine words in a multitude of ways to mimic human writing, but it doesn’t always make sense of what it’s writing. As you’ll see in a moment, the result is often a lot of textual “noise” without a “why” behind what it says.
Unfortunately, when trying to use the technology for something more concrete, we stumble upon these drawbacks, which turn out to be fundamental.
-
- Mar 2022
-
www.compose.com
-
The field_value_factor lets us use the value of an indexed field in the document to impact the document's relevance score. For example, if we had blog articles where we tracked the number of "likes" each received, we could use the value of the "likes" field to increase the score of the most popular articles.
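As a sketch, such a query could look like this in Elasticsearch's query DSL (expressed as a Python dict; the index and field names are assumptions):

```python
# Sketch: lift the score of popular articles by their "likes" count.
query = {
    "query": {
        "function_score": {
            "query": {"match": {"body": "elasticsearch"}},
            "field_value_factor": {
                "field": "likes",      # indexed numeric field on each article
                "modifier": "log1p",   # dampen the effect of very large counts
                "factor": 1.0,
            },
        }
    }
}
```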
-
Function scoring is Elasticsearch's toolbox of functions that allows us to manipulate relevance scores beyond what we've already looked at. There are many functions, but let's look at a couple of the most useful for dialing in our top 10 list. Note that you can even write your own custom function if none of the predefined ones work for your situation!
-
constant_score lets you negate the built-in scoring mechanisms (such as those described in our article on scoring in Elasticsearch) for whatever query or filter it wraps. This lets you match documents with particular characteristics, but be able to manually set the score received for matches by using a boost. If you do not add a boost or you use a boost of 1, then documents that match the characteristics will not have their scores affected by the match. If you set a boost higher than 1, then the document's relevance score will be raised accordingly.
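A minimal sketch of a constant_score query (field name and boost value assumed):

```python
# Sketch: every matching document gets a fixed score of 2 instead of a
# relevance-computed one.
query = {
    "query": {
        "constant_score": {
            "filter": {"term": {"in_stock": True}},
            "boost": 2.0,
        }
    }
}
```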
-
The boosting query type lets us use a negative boost to downgrade documents where certain terms are matched that we believe will cause the document to less relevant. For example, someone searching for "pumps" in a catalog might mean the tool, not the women's shoe. We could use a boosting query to retrieve all documents where there was a match for "pumps", but then give a negative boost to the ones containing terms like "shoes" or "heels".
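A sketch of the "pumps" example as a boosting query (field names and the boost value assumed):

```python
# Sketch: match "pumps", but score down documents that look like shoes.
query = {
    "query": {
        "boosting": {
            "positive": {"match": {"description": "pumps"}},
            "negative": {"match": {"description": "shoes heels"}},
            "negative_boost": 0.2,  # multiplier applied to demoted matches
        }
    }
}
```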
-
When using a query type that is looking for matches in more than one field, you can boost the weight of matches found in specific fields. For example, you can specify a field boost of 2 for the "title" field to indicate that matches found in the title are always twice as relevant as matches from other fields.
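For example, with a multi_match query the caret syntax applies such a field boost (field names assumed):

```python
# Sketch: title matches count twice as much as body matches.
query = {
    "query": {
        "multi_match": {
            "query": "relevance scoring",
            "fields": ["title^2", "body"],  # ^2 doubles the weight of title hits
        }
    }
}
```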
-
Beyond the built-in mechanisms, there are tools we can choose to apply at query time to affect a document's relevance score.
-
boolean query types take the hierarchical structure of the query and the number of conditionals into consideration when calculating the relevance score for matched documents. For the dismax query type, the combination of term-field matches will additionally impact the relevancy score. So, depending on the query type you choose and how you structure it, the Practical Scoring Function will be applied somewhat differently.
-
-
www.compose.com
-
How do you combat the sharding effect? There are a couple different ways. Document routing: You can use document routing to make sure documents from a single index all go to the same shard by using the value of a specified field. This assumes that your searches will be performed against a single index or on multiple indexes that live on the same shard. You'll want to use the routing field in your search request as well as at index time. Search type: Search type lets you specify an order of events you want the search to perform. For this situation the "dfs_query_then_fetch" will solve our problem. It will query all the shards to get the frequencies distributed across them, then perform the calculations on the matching documents.
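A sketch of both options with the Python client (index name, routing value, and 8.x-style client calls are assumptions):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch()

# Document routing: use the same routing value at index time and search time,
# so related documents land on (and are searched from) one shard.
es.index(index="articles", id=1, routing="customer-42",
         document={"title": "Scoring in Elasticsearch"})
es.search(index="articles", routing="customer-42",
          query={"match": {"title": "scoring"}})

# Search type: gather global term frequencies across all shards before scoring.
es.search(index="articles", search_type="dfs_query_then_fetch",
          query={"match": {"title": "scoring"}})
```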
-
Compose Elasticsearch deployments include 5 shards automatically. When we indexed our documents, we didn't make any specification about how sharding should be applied so the documents got doled out evenly across each of the shards - 50 documents on each of our 5 shards = 250 documents.
-
All else being equal, a document found on a shard with more total documents would be scored lower than a document on a shard with fewer total documents. A document found on a shard with more additional matching documents would be scored lower than one found on a shard with fewer or no additional matching documents. Not so good.
-
We actually have two sets of details - one for the query weight and one for the field weight
-
Note that term frequency, inverse document frequency, and field-length normalization are stored for each document at index time.
-
It can also be used to boost a particular index if you're searching across multiple indexes and want one to have more importance
-
Query boosting allows us to indicate that some part(s) of the query should be more important than other parts.
Seems important!
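A sketch of a query-time boost on one clause of a bool query (field names and boost value assumed):

```python
# Sketch: matches on "elasticsearch" count three times as much as "guide".
query = {
    "query": {
        "bool": {
            "should": [
                {"match": {"title": {"query": "elasticsearch", "boost": 3}}},
                {"match": {"title": "guide"}},
            ]
        }
    }
}
```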
-
Query boost: This is a percentage or absolute number that can be used to boost any query clause at query time.
-
Index boost: This is a percentage or absolute number used to boost any field at index time.
-
Query normalization is used so that different queries can be compared. For any individual query, however, it uses the same score for every document (effectively negating its impact within an individual query)
Does not affect scoring within a query.
It serves to make the scores of different queries comparable.
-
Query normalization (queryNorm): This is typically the sum of squared weights for the terms in the query.
-
For field length normalization, a term match found in a field with a low number of total terms is going to be more important than a match found in a field with a large number of terms.
-
Field length normalization (norm): This is the inverse square root of the number of terms in the field: norm = 1 / √(number of terms in the field)
-
Like term frequency, coordination can be turned off, but that is typically only done when the terms are synonymous with each other (and therefore, having more than one of them does not increase relevancy). A better way to handle that situation, though, is to populate a synonym file to handle synonyms automatically.
-
Coordination (coord): Counts the number of terms from the query that appear in the document.
-
Inverse document frequency (idf): This is one plus the natural log (as in "logarithm", not "log file") of the documents in the index divided by the number of documents that contain the term
-
Term frequency clearly assumes that the more times a term appears in a document, the higher its relevancy should be.
-
Term frequency (tf): This is the square root of the number of times the term appears in the field of a document: tf = √(number of appearances in the field)
-
it uses Lucene's Practical Scoring Function. This is a similarity model based on Term Frequency (tf) and Inverse Document Frequency (idf) that also uses the Vector Space Model (vsm) for multi-term queries.
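Putting the factors above together, here is a rough, simplified sketch of how they combine into one score (a toy re-implementation for intuition, not Lucene's exact code; note that Lucene's queryNorm is the inverse square root of the sum of squared weights):

```python
import math

def tf(term_freq):
    """Term frequency: square root of occurrences in the field."""
    return math.sqrt(term_freq)

def idf(num_docs, doc_freq):
    """Inverse document frequency: 1 + natural log of docs / docs-with-term."""
    return 1 + math.log(num_docs / doc_freq)

def field_norm(num_terms_in_field):
    """Field-length norm: inverse square root of terms in the field."""
    return 1 / math.sqrt(num_terms_in_field)

def practical_score(query_terms, doc_stats, num_docs):
    """doc_stats maps term -> (term_freq, doc_freq, num_terms_in_field)."""
    matched = [t for t in query_terms if t in doc_stats]
    coord = len(matched) / len(query_terms)  # coordination factor
    weights = [idf(num_docs, doc_stats[t][1]) for t in matched]
    query_norm = 1 / math.sqrt(sum(w * w for w in weights) or 1)
    return coord * query_norm * sum(
        tf(doc_stats[t][0]) * idf(num_docs, doc_stats[t][1]) ** 2
        * field_norm(doc_stats[t][2])
        for t in matched
    )

# Toy usage: 250 docs; "pumps" appears twice in a 100-term field of one doc.
print(practical_score(["pumps", "tool"], {"pumps": (2, 10, 100)}, num_docs=250))
```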
-
For example, a user searching on "apple" could mean the company or the fruit, but matches may occur on documents for both the company and for the fruit.
ES does not capture context. Weaviate should capture context as well. ||anjadjATdiplomacy.edu||
-
Before Elasticsearch starts scoring documents, it first reduces the candidate documents down by applying a boolean test - does the document match the query?
-
- Nov 2021
-
early-access-program.debater.res.ibm.com
-
we use the Index Searcher service to collect arguments relevant to the topic.
-
The general strategy is to first collect potentially relevant sentences from Wikipedia, then distill them to claims and evidence, and finally generate the speech.
-
Our topic for this example is: We should ban algorithmic trading, and our polarity is con. (Namely, we oppose the idea of banning algorithmic trading). The dominant concept (dc) of our topic, i.e., the Wikipedia title that features the main subject in our topic is Algorithmic trading. The dominant concept can be found by our Term Wikifier service, when fed with the topic.
-
-
dev-ai.diplomacy.edu:8522
-
that we can cheer you up o
-
Also. Also, with that process. My name is Aras Aras Ava from Diplo and the Geneva Internet Platform, and I'm also a very proud neck member and I will be your moderator for tonight's event. I will no
-
-
nimani.diplomacy.edu:8522
-
rders is an algorithmic marvel, so big props to the Cash App engineers, for solving a hard problem that in the end. Provides an easy interface that takes a step up to the next layer of abstraction over the stock market makin
-
Well, you have this universal Turing machine, and maybe the brain is something like that.
-
This is the artificial intelligence podcast
Good podcast!
-
- Oct 2021
-
dev-ai.diplomacy.edu:8509
-
While cyberspace is hardly terra nullius for the purpose of international law, the application of existing laws does present problems given the trans-border nature of many actions online. National law regulates much online activity, most robustly (but not exclusively) when the relevant acts take place within a national territory.
test tag
-
- Aug 2021
-
arxiv.org
-
For example, work has been done in making personalized dialogue responses by taking account of persona [144] and emotion [147], controlling various aspects of the response such as politeness [92], grounding the responses in an external source of knowledge [27,41,148], and generating topic-coherent sequences
-
These tasks can be combined as auxiliary tasks with the text generation task, resulting in a multi-task learning setting
-
Alternatively, one can directly derive the text generation targets from knowledge and use them to supervise the standard text generation task. This approach is called weakly-supervised learning
-
Knowledge as target. The first category of knowledge-related tasks creates learning targets based on the knowledge, and the model is trained to recover the targets.
-
Learning with knowledge-related tasks
-
Copy and pointing mechanisms are used to choose subsequences in the input sequence and put them at proper places in the output sequence.
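For instance, the pointer-generator formulation (See et al., 2017) is one common instance of such a mechanism (an illustrative example here; the survey covers several variants). It mixes generating from the vocabulary with copying from the input:

$$P(w) = p_{\mathrm{gen}}\, P_{\mathrm{vocab}}(w) + (1 - p_{\mathrm{gen}}) \sum_{i:\, x_i = w} a_i$$

where $a_i$ is the attention weight on source token $x_i$ and $p_{\mathrm{gen}} \in [0, 1]$ is a learned soft switch between generating and copying.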
-
A generation process is regarded as a sequential multi-label classification problem. It can be directly optimized by the negative log likelihood (NLL) loss.
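As a quick sketch, for a target sequence $y_1, \dots, y_T$ conditioned on input $x$, that loss is:

$$\mathcal{L}_{\mathrm{NLL}} = -\sum_{t=1}^{T} \log p(y_t \mid y_{<t}, x)$$

i.e., the sum of per-step classification losses over the generated tokens.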
-
The decoder decodes a given fixed-length vector representation into a variable-length sequence [113]. Specifically, the decoder generates an output sequence one token at each time step. At each step the model is auto-regressive, consuming the previously generated tokens as additional input when generating the next token.
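A minimal sketch of that auto-regressive loop (greedy decoding; `next_token_probs` is a stand-in for any model that maps the encoded input plus the tokens so far to a next-token distribution):

```python
# Greedy auto-regressive decoding: emit one token per step, feeding each
# generated token back in as additional input.
def greedy_decode(encoded_input, next_token_probs, bos=0, eos=1, max_len=50):
    tokens = [bos]
    for _ in range(max_len):
        probs = next_token_probs(encoded_input, tokens)  # consumes prior tokens
        next_tok = max(range(len(probs)), key=probs.__getitem__)
        tokens.append(next_tok)
        if next_tok == eos:  # stop once the end-of-sequence token appears
            break
    return tokens[1:]
```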
-
The primary challenge in knowledge-enhanced NLG is how to obtain useful related knowledge from diverse sources. There has been a rising line of work that discovers knowledge from topic, keyword, knowledge base, knowledge graph, and knowledge-grounded text. The second challenge is how to effectively understand and leverage the acquired knowledge to facilitate text generation.
-
Many existing knowledge-enhanced text generation systems have demonstrated promising performance on generating informative, logical, and coherent texts.
-
External knowledge acquisition occurs when knowledge is provided from outside sources, including but not limited to knowledge base, external knowledge graph, and grounded text.
-
text(s), including but not limited to keyword, topic, linguistic features, and internal graph structure
-
Internal knowledge creation takes place within the input
-
The performance of generation is still far from satisfaction in many real-world scenarios. For example, in dialogue systems, conditioning on only the input text, a text generation system often produces trivial or non-committal responses of frequent words or phrases in the corpus [87,135,148], such as “Me too.” or “Oh my god!” given the input text “My skin is so dry.” These mundane responses lack meaningful content, in contrast to human responses rich in knowledge. In comparison, humans are constantly acquiring, understanding, and storing knowledge from broader sources so that it can be employed to understand the current situation in communicating, reading, and writing. For example, in conversations, people often first select concepts from related topics (e.g., sports, food), then organize those topics into understandable content to respond; for summarization, people tend to write summaries containing keywords used in the input document and perform necessary modifications to ensure grammatical correctness and fluency; in question answering (QA), people use commonsense or professional knowledge pertaining to the question to infer the answer. Therefore, it is often the case that knowledge beyond the input sequence is required to produce informative output text.
Excellent explanation why NLG models need additional knowledge!
-
For example, machine translation generates text in a different language based on the source text; summarization generates an abridged version of the source text to include salient information; question answering (QA) generates textual answers to given questions; dialogue systems support chatbots to communicate with humans with generated responses.
-
NLG aims at producing understandable text in human language from linguistic or non-linguistic data in a variety of forms such as textual data, numerical data, image data, structured knowledge bases, and knowledge graphs. Among these, text-to-text generation is one of the most important applications and is thus often referred to simply as “text generation”.
-
We divide different knowledge sources into internal knowledge and external knowledge. Internal knowledge creation takes place within the input text(s), while external knowledge acquisition occurs when knowledge is provided from outside sources (e.g., Wikipedia, ConceptNet).
-
A Survey of Knowledge-Enhanced Text Generation
Excellent survey on text generation enhanced by external knowledge
-